Aralar1, syntaxin1A or cpo homologous proteins involved in the
regulation of energy homeostasis 



Description 



This invention relates to the use of nucleic acid sequences encoding 
aralar1, syntaxin1A, or cpo homologous proteins, and the polypeptides
encoded thereby and to the use thereof in the diagnosis, study, prevention, 
and treatment of diseases and disorders related to body-weight regulation, 
for example, but not limited to, metabolic diseases such as obesity as well 
as related disorders such as eating disorder, cachexia, diabetes mellitus, 
hypertension, coronary heart disease, hypercholesterolemia, dyslipidemia, 
osteoarthritis, gallstones, cancer, e.g. cancers of the reproductive organs, 
and sleep apnea. 

Obesity is one of the most prevalent metabolic disorders in the world. It is
a still poorly understood human disease that is becoming increasingly
relevant for Western society. Obesity is defined as an excess of body fat,
frequently resulting in a significant impairment of health. Besides severe 
risks of illness such as diabetes, hypertension and heart disease, 
individuals suffering from obesity are often isolated socially. Human obesity 
is strongly influenced by environmental and genetic factors, whereby the 
environmental influence is often a hurdle for the identification of (human) 
obesity genes. Obesity is influenced by genetic, metabolic, biochemical, 
psychological, and behavioral factors. As such, it is a complex disorder 
that must be addressed on several fronts to achieve lasting positive clinical 
outcome. Obese individuals are prone to ailments including: diabetes 
mellitus, hypertension, coronary heart disease, hypercholesterolemia, 
dyslipidemia, osteoarthritis, gallstones, cancers of the reproductive organs, 
and sleep apnea. 



Obesity is not to be considered as a single disorder but a heterogeneous 
group of conditions with (potential) multiple causes. Obesity is also 
characterized by elevated fasting plasma insulin and an exaggerated insulin 
response to oral glucose intake (Koltermann, J. Clin. Invest 65, 1980, 
1272-1284) and a clear involvement of obesity in type 2 diabetes mellitus 
can be confirmed (Kopelman, Nature 404, 2000, 635-643). 

Even if several candidate genes have been described which are supposed 
to influence the homeostatic system(s) that regulate body mass/weight, 
like leptin, VCPI, VCPL, or the peroxisome proliferator-activated 
receptor-gamma co-activator, the distinct molecular mechanisms and/or 
molecules influencing obesity or body weight/body mass regulations are 
not known. 

Therefore, the technical problem underlying the present invention was to 
provide for means and methods for modulating (pathological) metabolic 
conditions influencing body-weight regulation and/or energy homeostatic 
circuits. The solution to said technical problem is achieved by providing the 
embodiments characterized in the claims.

Accordingly, the present invention relates to genes with novel functions in 
body-weight regulation, energy homeostasis, metabolism, and obesity. The 
present invention discloses specific genes involved in the regulation of 
body-weight, energy homeostasis, metabolism, and obesity, and thus in 
disorders related thereto such as eating disorder, cachexia, diabetes 
mellitus, hypertension, coronary heart disease, hypercholesterolemia, 
dyslipidemia, osteoarthritis, gallstones, cancer, e.g. cancers of the 
reproductive organs, and sleep apnea. The present invention describes the 
human aralar1, syntaxin1A, or cpo homologous genes as being involved in
those conditions mentioned above. 
The term 'GenBank Accession number' relates to NCBI GenBank database
entries (Benson et al, Nucleic Acids Res. 28, 2000, 15-18). 

Energy transduction in mitochondria requires the transport of many specific 
metabolites across the inner membrane of this eukaryotic organelle. The 
mitochondrial carrier family (MCF) consists of at least thirty-seven proteins
(Kuan J. and Saier M.H., 1993, Crit Rev Biochem Mol Biol 28(3):209-233).
The mitochondrial aspartate/glutamate carrier catalyzes an important step
in both the urea cycle and the aspartate/malate NADH shuttle. Citrin and
aralar1 are homologous proteins belonging to the mitochondrial carrier
family with EF-hand Ca(2+)-binding motifs in their N-terminal domains.
Citrin and aralar1 are Ca(2+)-stimulated aspartate/glutamate
transporter isoforms in mitochondria (Palmieri L. et al., 2001, EMBO J
20(18):5060-9). Solute carrier family 25, member 13 (SLC25A13) encodes
a calcium-binding mitochondrial carrier protein, designated citrin. Mutations 
in the SLC25A13 gene lead to adult-onset type II citrullinemia (Yasuda T. 
et al., 2000, Hum Genet 107(6):537-545). 

The Drosophila melanogaster gene syntaxin1A (Syx1A) encodes a t-SNARE
protein involved in neurotransmitter release which is localised to the
synaptic vesicle. The protein is expressed in the adult (adult brain, lamina,
medulla and neuropil), embryo (amnioproctodeal invagination, anterior
embryonic/larval midgut, anterior midgut primordium, axon and other
tissues), larva (axon, bouton, neuromuscular junction and synapse) and
ovary (germarium region 2a, germarium region 2b, germarium region 3,
germline cyst and nurse cell) in Drosophila. Twenty-seven recorded mutant
alleles have been described for Syx1A. Amorphic mutations have been
isolated which affect the neuromuscular junction, the embryonic maternal
effect cuticle, the embryonic maternal effect larval midgut and 11 other
tissues and are embryonic lethal, embryonic recessive neurophysiology
defective and embryonic recessive neuroanatomy defective.
The SNARE hypothesis predicts that a family of SNAP receptors are
localized to and function in diverse intracellular membrane compartments 
where membrane fusion processes take place. Syntaxins, the prototype 
family of SNARE proteins, have a carboxy-terminal tail-anchor and multiple 
coiled-coil domains. There are 15 members of the syntaxin family in the 
human genome. In conjunction with other SNAREs and with the 
cytoplasmic NSF and SNAP proteins, syntaxins mediate vesicle fusion in 
diverse vesicular transport processes along the exocytic and the endocytic 
pathway. They are crucial components that both drive and provide 
specificity to the myriad vesicular fusion processes that characterize the 
eukaryotic cell (Teng F.Y. et al., 2001, Genome Biol 2(11): 3012).

Syntaxins are a family of receptors for intracellular transport vesicles. Each 
target membrane may be identified by a specific member of the syntaxin 
family (Bennett et al., Cell 74: 863-873(1993)). Members of the syntaxin 
family have a size ranging from 30 kDa to 40 kDa and consist of a highly 
hydrophobic carboxy-terminal extremity that anchors the protein on the 
cytoplasmic surface of cellular membranes and of a central, well conserved 
region, which seems to be in a coiled-coil conformation (see, for example,
Spring et al., 1993, Trends Biochem. Sci. 18: 124-125; Pelham, 1993, Cell
73: 425-426). Mammalian syntaxins A and B are nervous
system-specific proteins implicated in the docking of synaptic vesicles with 
the presynaptic plasma membrane. The process of vesicular fusion with 
target membranes depends on a set of SNAREs (SNAP-Receptors), which 
are associated with the fusing membranes (see, for example, Marxen et al., 
1997, Neurochem. Res. 22(8): 941-950; Hanson et al., 1997, Curr. Opin.
Neurobiol. 7(3): 310-315; Weimbs et al., Proc. Natl. Acad. Sci. U.S.A.
94(7): 3046-3051 (1997)). Target SNAREs (t-SNAREs) are localized on the
target membrane and belong to two different families, the syntaxin-like 
family and the SNAP-25 like family. One member of each family, together 
with a v-SNARE localized on the vesicular membrane, are required for 
fusion. 



Cysteine-string protein (Csp) is a secretory vesicle protein that functions in 
presynaptic neurotransmission and also in regulated exocytosis from 
non-neuronal cells. Csp1 is expressed in 3T3-L1 adipocytes, and
cellular levels of this protein are increased following cell differentiation. It
has been shown that syntaxin 1A binds to both Csp isoforms, and actually
exhibits a higher affinity for the Csp2 protein (Chamberlain et al., 2001, J 
Cell Sci 114(Pt 2):445-55). 

Syntaxin 1A is a candidate gene for Type II (non-insulin-dependent) 
diabetes mellitus because it plays an important role in insulin secretion 
from the islet beta cells. Decreased expression of t-SNARE, syntaxin 1, and
SNAP-25 in pancreatic beta-cells is involved in impaired insulin secretion 
from diabetic GK rat islets: restoration of decreased t-SNARE proteins 
improves impaired insulin secretion (see, Nagamatsu S et al. 1999, 
Diabetes 48(12):2367-73). Yang et al. (1999) have shown that syntaxin 1 
interacts with the L(D) subtype of voltage-gated Ca(2+) channels in
pancreatic beta cells (see Proc Natl Acad Sci USA 96(18): 10164-9). It
has been found that syntaxin 1 protein levels are decreased in brain of 
obese (ob/ob) and diabetic (db/db) mice and can be elevated to normal 
levels by application of leptin (Ahima et al., 1999, Endocrinology
140(6):2755-62). It has also been described by Tsunoda et al. that a
single nucleotide polymorphism (D68D, T to C) in the syntaxin 1A gene 
correlates to age at onset and insulin requirement in Type II diabetic 
Japanese patients (see Diabetologia 2001 44(11):2092-7). Beta-cell
hypertrophy in fa/fa rats is associated with basal glucose hypersensitivity 
and reduced SNARE protein expression (Chan et al. 1999 Diabetes 
48(5):997-1005). 

The Drosophila melanogaster gene couch potato (cpo, GadFly Accession 
Number CG18434) encodes a putative nuclear RNA binding protein. The
protein is expressed in the Drosophila embryo (embryonic central nervous 
system, embryonic peripheral nervous system, embryonic/larval midgut, 
glial cell and other tissues) (Harvie et al., 1998, Genetics 149(1):
217-231). At least three protein isoforms (for example, Cpo 17, Cpo 61.1
and Cpo 61.2) and 49 recorded mutant alleles have been described.
Mutations have been isolated which affect the larval ventral ganglion and 
are recessive lethal in Drosophila. Mutant cpo flies exhibit an abnormal and 
hypoactive behavior (Bellen et al., 1992, Genetics 131: 365-375, and 
Bellen et al., 1992, Genes Dev. 6: 2125-2136). This invention describes
the RNA-binding protein with multiple splicing and a hypothetical
protein, XP_091097, as human homologs of the Drosophila cpo gene
product. No further information is available for the human
homolog proteins from the prior art.
So far, it has not been described that aralar1, syntaxin1A, or cpo and
homologous proteins are involved in the regulation of energy homeostasis 
and body-weight regulation and related disorders, and thus, no functions in 
metabolic diseases and other diseases as listed above have been 
discussed. In this invention we demonstrate that the correct gene dose of 
aralar1, syntaxin1A, or cpo is essential for maintenance of energy
homeostasis. A genetic screen was used to identify that mutation of
aralar1, syntaxin1A, or cpo homologous genes causes obesity, reflected by
a significant change of triglyceride content, the major energy storage 
substance. The function of cpo in metabolic disorders is further validated 
by data obtained from additional screens. For example, an additional screen 
using Drosophila mutants with modifications of the eye phenotype 
identified an interaction of cpo with adipose, a protein regulating, causing 
or contributing to obesity. Furthermore, an additional screen using 
Drosophila mutants with modifications of the eye phenotype identified a 
modification of UCP activity by cpo, thereby leading to an altered 
mitochondrial activity. These findings suggest the presence of similar
activities of the described homologous proteins in humans, which provides
insight into diagnosis, treatment, and prognosis of metabolic disorders.



Polynucleotides encoding proteins with homologies to aralar1, syntaxin1A,
or cpo are suitable to investigate diseases and disorders as described 
above. Further new compositions useful in diagnosis, treatment, and 
prognosis of diseases and disorders as described above are provided. 

Before the present proteins, nucleotide sequences, and methods are 
described, it is understood that this invention is not limited to the particular 
methodology, protocols, cell lines, vectors, and reagents described as 
these may vary. It is also to be understood that the terminology used 
herein is for the purpose of describing particular embodiments only, and is 
not intended to limit the scope of the present invention that will be limited 
only by the appended claims. Unless defined otherwise, all technical and 
scientific terms used herein have the same meanings as commonly 
understood by one of ordinary skill in the art to which this invention 
belongs. Although any methods and materials similar or equivalent to those 
described herein can be used in the practice or testing of the present 
invention, the preferred methods, devices, and materials are now 
described. All publications mentioned herein are incorporated herein by 
reference for the purpose of describing and disclosing the cell lines, 
vectors, and methodologies that are reported in the publications which 
might be used in connection with the invention. Nothing herein is to be 
construed as an admission that the invention is not entitled to antedate 
such disclosure. 

The present invention discloses that aralar1, syntaxin1A, or cpo
homologous proteins regulate energy homeostasis and fat
metabolism, especially the metabolism and storage of triglycerides, and
discloses polynucleotides which identify and encode the proteins disclosed in this
invention. The invention also relates to vectors, host cells, antibodies, and 
recombinant methods for producing the polypeptides and polynucleotides 
of the invention. The invention also relates to the use of these sequences 
in the diagnosis, study, prevention, and treatment of diseases and 
disorders, for example, but not limited to, metabolic diseases such as
obesity as well as related disorders such as eating disorder, cachexia,
diabetes mellitus, hypertension, coronary heart disease,
hypercholesterolemia, dyslipidemia, osteoarthritis, gallstones, cancer, e.g.
cancers of the reproductive organs, and sleep apnea.

Aralar1, syntaxin1A, or cpo homologous proteins and nucleic acid
molecules coding therefor are obtainable from insect or vertebrate
species, e.g. mammals or birds. Particularly preferred are homologous
nucleic acids, particularly nucleic acids encoding a human solute carrier
family 25 (mitochondrial carrier, aralar), member 12 protein (GenBank
Accession Number XP_010876.3 for the protein, XM_010876 for the
cDNA), a human solute carrier family 25, member 13 protein (citrin)
(GenBank Accession Number NP_055066.1 for the protein, NM_014251
for the cDNA), a human syntaxin 1B2 protein (GenBank Accession Number
NP_443106.1 for the protein, NM_052874 for the cDNA), a human
syntaxin 1B protein (GenBank Accession Number NP_003154.1 for the
protein, NM_003163 for the cDNA), a human RNA-binding protein with
multiple splicing (RBPMS; GenBank Accession Number XP_047075.1 for
the protein, XM_047075 for the cDNA), or a human protein similar to
RNA-binding protein with multiple splicing (RBP-MS; GenBank Accession
Number XP_091097 for the protein, XM_091097 for the cDNA).

The invention particularly relates to a nucleic acid molecule encoding a
polypeptide contributing to regulating the energy homeostasis and the
metabolism of triglycerides, wherein said nucleic acid molecule comprises
(a) the nucleotide sequence of (i) aralar1 (GadFly Accession Number
CG2139), a human solute carrier family 25 (mitochondrial carrier, Aralar),
member 12 (GenBank Accession Number XP_010876.3 for the protein,
XM_010876 for the cDNA), or a human solute carrier family 25, member
13 (citrin) (GenBank Accession Number NP_055066.1 for the protein,
NM_014251 for the cDNA), (ii) syntaxin1A (GadFly Accession Number
CG18615), a human syntaxin 1B2 (GenBank Accession Number
NP_443106.1 for the protein, NM_052874 for the cDNA), or a human
syntaxin 1B (GenBank Accession Number NP_003154.1 for the protein,
NM_003163 for the cDNA), or (iii) cpo (GadFly Accession Number
CG18434), SEQ ID NO:1 (GadFly Accession Number CG31243), a human
RNA-binding protein gene with multiple splicing (RBPMS; GenBank
Accession Number XP_047075.1 for the protein, XM_047075 for the
cDNA), and a human gene similar to RNA-binding protein with multiple
splicing (RBP-MS; GenBank Accession Number XP_091097 for the protein,
XM_091097 for the cDNA), and/or a sequence complementary thereto,

(b) a nucleotide sequence which hybridizes at 50°C in a solution 
containing 1 x SSC and 0.1 % SDS to a sequence of (a), 

(c) a sequence corresponding to the sequences of (a) or (b) within the 
degeneration of the genetic code, 

(d) a sequence which encodes a polypeptide which is at least 85%,
preferably at least 90%, more preferably at least 95%, more preferably at
least 98% and up to 99.6% identical to the amino acid sequences of an
aralar1, syntaxin1A, or cpo protein, preferably of a human solute carrier
family 25 (mitochondrial carrier, Aralar), member 12 protein (GenBank
Accession Number XP_010876.3 for the protein, XM_010876 for the
cDNA), a human solute carrier family 25, member 13 protein (citrin)
(GenBank Accession Number NP_055066.1 for the protein, NM_014251
for the cDNA), a human syntaxin 1B2 protein (GenBank Accession Number
NP_443106.1 for the protein, NM_052874 for the cDNA), a human
syntaxin 1B protein (GenBank Accession Number NP_003154.1 for the
protein, NM_003163 for the cDNA), a human RNA-binding protein with
multiple splicing (RBPMS; GenBank Accession Number XP_047075.1 for
the protein, XM_047075 for the cDNA), and a human protein similar to
RNA-binding protein with multiple splicing (RBP-MS; GenBank Accession
Number XP_091097 for the protein, XM_091097 for the cDNA),




(e) a sequence which differs from the nucleic acid molecule of (a) to (d) 
by mutation and wherein said mutation causes an alteration, deletion, 
duplication and/or premature stop in the encoded polypeptide or 

(f) a partial sequence of any of the nucleotide sequences of (a) to (e) 
having a length of at least 15 bases, preferably at least 20 bases, more 
preferably at least 25 bases and most preferably at least 50 bases. 
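
The identity thresholds in item (d) can be illustrated with a minimal Python sketch. The text does not state how percent identity is to be computed, so the convention used here (identical residues divided by aligned columns in which neither sequence has a gap) is an assumption, as are the function names and the example peptides.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs are rows of an existing pairwise alignment,
    with '-' marking gaps. The alignment step itself is not shown.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_tier(pct: float):
    """Return the highest threshold from item (d) that pct reaches, or None."""
    for threshold in (98.0, 95.0, 90.0, 85.0):
        if pct >= threshold:
            return threshold
    return None


# Hypothetical 10-residue peptides differing at one position.
pct = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")
print(pct, identity_tier(pct))
```

Other conventions, for example dividing by the length of the shorter sequence or counting gap columns as mismatches, give different values for the same alignment, so the convention should be fixed before comparing against a threshold.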

The invention is based on the finding that aralar1, syntaxin1A, or cpo
homologous proteins (herein referred to as aralar1, syntaxin1A, or cpo) and
the polynucleotides encoding these are involved in the regulation of
triglyceride storage and therefore energy homeostasis. The invention 
describes the use of these compositions for the diagnosis, study, 
prevention, or treatment of diseases and disorders related thereto, 
including metabolic diseases such as obesity as well as related disorders 
such as eating disorder, cachexia, diabetes mellitus, hypertension, 
coronary heart disease, hypercholesterolemia, dyslipidemia, osteoarthritis, 
gallstones, cancer, e.g. cancers of the reproductive organs, and sleep 
apnea. 

Accordingly, the present invention relates to genes with novel functions in 
body-weight regulation, energy homeostasis, metabolism, and obesity. To 
find genes with novel functions in energy homeostasis, metabolism, and 
obesity, a functional genetic screen was performed with the model 
organism Drosophila melanogaster (Meigen). One resource for screening 
was a Drosophila melanogaster stock collection of EP-lines. The P-vector of 
this collection has Gal4-UAS-binding sites fused to a basal promoter that 
can transcribe adjacent genomic Drosophila sequences upon binding of 
Gal4 to UAS-sites. This makes the EP-line collection suitable for overexpression of
endogenous flanking gene sequences. In addition, without activation of the
UAS-sites, integration of the EP-element into the gene is likely to cause a 
reduction of gene activity, and allows determining its function by 
evaluating the loss-of-function phenotype. 



Obese people mainly show a significant increase in the content of 
triglycerides. Triglycerides are the most efficient storage for energy in cells. 
In order to isolate genes with a function in energy homeostasis, several 
thousand proprietary EP-lines were tested for their triglyceride content after 
a prolonged feeding period (see Examples for more detail). Lines with 
significantly changed triglyceride content were selected as positive 
candidates for further analysis. The increase or decrease of triglyceride 
content due to the loss of a gene function suggests gene activities in 
energy homeostasis in a dose dependent manner that controls the amount 
of energy stored as triglycerides. 
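
The selection step described above can be sketched in Python. The text does not give the criterion used to call a change in triglyceride content significant, so the fold-change cutoff below is purely an assumption for illustration, and the line names and measurement values are invented placeholders, not data from the screen.

```python
from statistics import mean


def flag_candidate_lines(tg_by_line, control_values, fold_cutoff=1.5):
    """Flag lines whose mean triglyceride content deviates from the controls.

    tg_by_line: mapping of line name -> list of replicate measurements.
    control_values: replicate measurements of the control genotype.
    fold_cutoff: assumed threshold; a line is flagged when its mean is at
    least fold_cutoff times the control mean, or at most 1/fold_cutoff of it.
    """
    control_mean = mean(control_values)
    candidates = {}
    for line, values in tg_by_line.items():
        ratio = mean(values) / control_mean
        if ratio >= fold_cutoff or ratio <= 1.0 / fold_cutoff:
            candidates[line] = round(ratio, 2)
    return candidates


# Placeholder values in arbitrary units, for illustration only.
measurements = {
    "line_A": [20.0, 22.0],
    "line_B": [10.5, 9.5],
    "line_C": [5.0, 5.0],
}
print(flag_candidate_lines(measurements, control_values=[10.0, 10.0]))
```

A real analysis would normally also test the replicate variance statistically; the ratio is shown here only because both increases and decreases relative to the control are of interest.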

In this invention, the content of triglycerides of a pool of flies with the 
same genotype after feeding for six days was analyzed using a triglyceride 
assay. Male flies homozygous or heterozygous for the integration of 
vectors for Drosophila lines EP(3)3675, EP(3)3215, or EP(3)0661 were
analyzed in an assay measuring the triglyceride contents of these flies, 
illustrated in more detail in the EXAMPLES section. The results of the 
triglyceride content analysis are shown in FIGURES 1, 4, and 7, 
respectively. 

Genomic DNA sequences flanking the integration site of the EP vector
(herein EP(3)3675, EP(3)3215, or EP(3)0661) were isolated. Using those
isolated genomic sequences, public databases like Berkeley Drosophila
Genome Project (GadFly) were screened, thereby identifying the integration
site of the vectors, and the corresponding genes, described in more detail
in the EXAMPLES section. The molecular organization of the genes is 
shown in FIGURES 2, 5, and 8, respectively. 

An additional screen using Drosophila mutants with modifications of the 
eye phenotype identified an interaction of cpo with adipose, a protein 
regulating, causing or contributing to obesity. An additional screen using 
Drosophila mutants with modifications of the eye phenotype identified a 
modification of UCP activity by cpo, thereby leading to an altered
mitochondrial activity. 

The present invention further describes polypeptides comprising the amino 
acid sequences of aralar1, syntaxin1A, or cpo and homologous proteins.
Based upon homology, the proteins of the invention and each homologous
protein or peptide may share at least some activity. No functional data
describing the regulation of body weight control and related metabolic
diseases are available in the prior art for the genes of the invention.

The invention also encompasses polynucleotides that encode aralar1,
syntaxin1A, or cpo and homologous proteins. Accordingly, any nucleic acid
sequence, which encodes the amino acid sequences of aralar1,
syntaxin1A, or cpo and homologous proteins, can be used to generate
recombinant molecules that express aralar1, syntaxin1A, or cpo and
homologous proteins. In a particular embodiment, the invention
encompasses a nucleic acid encoding the aralar1 protein (GadFly
Accession Number CG2139), a human solute carrier family 25
(mitochondrial carrier, Aralar), member 12 protein (GenBank Accession
Number XP_010876.3 for the protein, XM_010876 for the cDNA), a
human solute carrier family 25, member 13 protein (citrin) (GenBank
Accession Number NP_055066.1 for the protein, NM_014251 for the
cDNA), syntaxin1A protein (GadFly Accession Number CG18615), a
human syntaxin 1B2 protein (GenBank Accession Number NP_443106.1
for the protein, NM_052874 for the cDNA), a human syntaxin 1B
(GenBank Accession Number NP_003154.1 for the protein, NM_003163
for the cDNA), cpo (GadFly Accession Number CG18434), SEQ ID NO:1
(GadFly Accession Number CG31243), a human RNA-binding protein with
multiple splicing (RBPMS; GenBank Accession Number XP_047075.1 for
the protein, XM_047075 for the cDNA), or a human protein similar to
RNA-binding protein with multiple splicing (RBP-MS; GenBank Accession
Number XP_091097 for the protein, XM_091097 for the cDNA). It will be



appreciated by those skilled in the art that as a result of the degeneracy of 
the genetic code, a multitude of nucleotide sequences encoding the 
proteins, some bearing minimal homology to the nucleotide sequences of 
any known and naturally occurring gene, may be produced. Thus, the 
invention contemplates each and every possible variation of nucleotide 
sequence that could be made by selecting combinations based on possible 
codon choices. These combinations are made in accordance with the 
standard triplet genetic code as applied to the nucleotide sequences of 
naturally occurring aralar1, syntaxin1A, or cpo and homologous proteins,
and all such variations are to be considered as being specifically disclosed. 
Although nucleotide sequences, which encode the proteins, and their 
variants are preferably capable of hybridising to the nucleotide sequences 
of the naturally occurring proteins under appropriately selected conditions 
of stringency, it may be advantageous to produce nucleotide sequences 
encoding the proteins or their derivatives possessing a substantially 
different codon usage. Codons may be selected to increase the rate at 
which expression of the peptide occurs in a particular prokaryotic or 
eukaryotic host in accordance with the frequency with which particular 
codons are utilized by the host. Other reasons for substantially altering the 
nucleotide sequence encoding the proteins and their derivatives without 
altering the encoded amino acid sequences include the production of RNA 
transcripts having more desirable properties, such as a greater half-life, 
than transcripts produced from the naturally occurring sequences. The 
invention also encompasses production of DNA sequences, or portions 
thereof, which encode the proteins and their derivatives, entirely by 
synthetic chemistry. After production, the synthetic sequence may be
inserted into any of the many available expression vectors and cell systems 
using reagents that are well known in the art at the time of the filing of this 
application. Moreover, synthetic chemistry may be used to introduce 
mutations into a sequence encoding the protein or any portion thereof. 
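
The degeneracy argument above can be made concrete with a short Python sketch that counts how many distinct nucleotide sequences encode a given peptide under the standard genetic code. The peptide and DNA strings are arbitrary illustrations, and the function names are chosen here for the example only.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Invert the table: amino acid (or '*' for stop) -> list of synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences for the peptide (no stop codon)."""
    return prod(len(SYNONYMS[aa]) for aa in peptide.upper())


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two different nucleotide sequences encoding the same tripeptide.
print(translate("ATGAAACTG"), translate("ATGAAGTTA"), count_encodings("MKL"))
```

Because the count is a product over residues, it grows very quickly with peptide length, which is why a coding sequence can be rewritten for a host's codon preferences without changing the encoded protein.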
Also encompassed by the invention are polynucleotide sequences that are
capable of hybridizing to the claimed nucleotide sequences, and in
particular, those of the polynucleotide of aralar1 (GadFly Accession
Number CG2139), a human solute carrier family 25 (mitochondrial carrier,
Aralar), member 12 (GenBank Accession Number XP_010876.3 for the
protein, XM_010876 for the cDNA), a human solute carrier family 25,
member 13 (citrin) (GenBank Accession Number NP_055066.1 for the
protein, NM_014251 for the cDNA), syntaxin1A (GadFly Accession
Number CG18615), a human syntaxin 1B2 (GenBank Accession Number
NP_443106.1 for the protein, NM_052874 for the cDNA), a human
syntaxin 1B (GenBank Accession Number NP_003154.1 for the protein,
NM_003163 for the cDNA), cpo (GadFly Accession Number CG18434),
SEQ ID NO:1 (GadFly Accession Number CG31243), a human RNA-binding
protein gene with multiple splicing (RBPMS; GenBank Accession Number
XP_047075.1 for the protein, XM_047075 for the cDNA), or a human
gene similar to RNA-binding protein with multiple splicing (RBP-MS;
GenBank Accession Number XP_091097 for the protein, XM_091097 for
the cDNA), under various conditions of stringency. Hybridization conditions
are based on the melting temperature (Tm) of the nucleic acid binding 
complex or probe, as taught in Wahl, G. M. and S. L. Berger (1987;
Methods Enzymol. 152:399-407) and Kimmel, A. R. (1987; Methods 
Enzymol. 152:507-511), and may be used at a defined stringency. 
Preferably, hybridization under stringent conditions means that after 
washing for 1 h with 1 x SSC and 0.1 % SDS at 50°C, preferably at 55°C, 
more preferably at 62°C and most preferably at 68°C, particularly for 1 h 
in 0.2 x SSC and 0.1 % SDS at 50°C, preferably at 55 °C, more preferably 
at 62°C and most preferably at 68°C, a positive hybridization signal is 
observed. Altered nucleic acid sequences encoding the proteins which are 
encompassed by the invention include deletions, insertions, or 
substitutions of different nucleotides resulting in a polynucleotide that 
encodes the same or a functionally equivalent protein. 
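
To illustrate why the wash conditions above are ordered by salt concentration and temperature, the following Python sketch estimates the melting temperature of a long DNA duplex. The formula is a widely used empirical approximation for hybrids without formamide and is an assumption here; it is not taken from this text or from the cited references. The sodium concentration of 1 x SSC (0.15 M NaCl plus 0.015 M trisodium citrate, about 0.195 M Na+) and the example sequence are likewise assumptions.

```python
import math

# Assumed Na+ molarity of 1 x SSC: 0.15 M NaCl + 3 * 0.015 M trisodium citrate.
SSC_1X_SODIUM_MOLAR = 0.195


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm_celsius(seq: str, ssc_fold: float) -> float:
    """Empirical Tm estimate for a long DNA:DNA hybrid without formamide.

    Tm = 81.5 + 16.6 * log10([Na+]) + 0.41 * (%GC) - 500 / length
    This approximation is an assumption of this sketch and is only a
    rough guide; it is not reliable for short oligonucleotides.
    """
    sodium = SSC_1X_SODIUM_MOLAR * ssc_fold
    return (81.5 + 16.6 * math.log10(sodium)
            + 0.41 * gc_percent(seq) - 500.0 / len(seq))


# Hypothetical 200-nt probe with 50% GC content.
probe = "ATGC" * 50
for fold in (1.0, 0.2):
    print(f"{fold} x SSC: Tm ~ {estimate_tm_celsius(probe, fold):.1f} C")
```

Under this approximation, lowering the salt from 1 x SSC to 0.2 x SSC reduces the estimated Tm by 16.6 * log10(5), about 11.6 °C, so a wash at a given temperature is more stringent in 0.2 x SSC than in 1 x SSC.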
The encoded proteins may also contain deletions, insertions, or
substitutions of amino acid residues, which produce a silent change and 
result in functionally equivalent proteins. Deliberate amino acid 
substitutions may be made on the basis of similarity in polarity, charge, 
solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of 
the residues as long as the biological activity of the protein is retained. For 
example, negatively charged amino acids may include aspartic acid and 
glutamic acid; positively charged amino acids may include lysine and 
arginine; and amino acids with uncharged polar head groups having similar 
hydrophilicity values may include leucine, isoleucine, and valine; glycine or 
alanine; asparagine and glutamine; serine and threonine; phenylalanine and 
tyrosine. 
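
The residue groupings listed above can be encoded directly. The following Python sketch checks whether a single amino acid substitution stays within one of those groups; the one-letter codes and the function name are conveniences introduced for this example, and membership in a group does not by itself show that biological activity is retained.

```python
# Groups as listed in the text, written in one-letter amino acid codes:
# Asp/Glu; Lys/Arg; Leu/Ile/Val; Gly/Ala; Asn/Gln; Ser/Thr; Phe/Tyr.
SUBSTITUTION_GROUPS = [
    frozenset("DE"),
    frozenset("KR"),
    frozenset("LIV"),
    frozenset("GA"),
    frozenset("NQ"),
    frozenset("ST"),
    frozenset("FY"),
]


def is_within_group(original: str, replacement: str) -> bool:
    """True if both residues are identical or fall in the same listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in SUBSTITUTION_GROUPS)


print(is_within_group("D", "E"))  # same group (negatively charged)
print(is_within_group("D", "K"))  # different groups
```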

Also included within the scope of the present invention are alleles of the 
genes encoding aralar1, syntaxin1A, or cpo and homologous proteins. As
used herein, an "allele" or "allelic sequence" is an alternative form of the 
gene, which may result from at least one mutation in the nucleic acid 
sequence. Alleles may result in altered mRNAs or polypeptides whose 
structure or function may or may not be altered. Any given gene may
have none, one, or many allelic forms. Common mutational changes, which 
give rise to alleles, are generally ascribed to natural deletions, additions, or 
substitutions of nucleotides. Each of these types of changes may occur 
alone, or in combination with the others, one or more times in a given 
sequence. Methods for DNA sequencing which are well known and 
generally available in the art may be used to practice any embodiments of 
the invention. The methods may employ such enzymes as the Klenow 
fragment of DNA polymerase I, SEQUENASE DNA Polymerase (US 
Biochemical Corp, Cleveland Ohio), Taq polymerase (Perkin Elmer), 
thermostable T7 polymerase (Amersham, Chicago, Ill.), or combinations of
recombinant polymerases and proof-reading exonucleases such as the
ELONGASE Amplification System (GIBCO/BRL, Gaithersburg, Md.).
Preferably, the process is automated with machines such as the Hamilton 
MICROLAB 2200 (Hamilton, Reno Nev.), Peltier thermal cycler (PTC200;
MJ Research, Watertown, Mass.) and the ABI 377 DNA sequencers (Perkin 
Elmer). 

The nucleic acid sequences encoding aralar1, syntaxin1A, or cpo and
homologous proteins may be extended utilizing a partial nucleotide 
sequence and employing various methods known in the art to detect 
upstream sequences such as promoters and regulatory elements. For 
example, one method which may be employed, "restriction-site" PCR, uses 
universal primers to retrieve unknown sequence adjacent to a known locus 
(Sarkar, G. (1993) PCR Methods Applic. 2:318-322). In particular, genomic 
DNA is first amplified in the presence of a primer to a linker sequence and a 
primer specific to the known region. The amplified sequences are then 
subjected to a second round of PCR with the same linker primer and 
another specific primer internal to the first one. Products of each round of 
PCR are transcribed with an appropriate RNA polymerase and sequenced 
using reverse transcriptase. Inverse PCR may also be used to amplify or 
extend sequences using divergent primers based on a known region 
(Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). The primers may be 
designed using OLIGO 4.06 primer analysis software (National Biosciences 
Inc., Plymouth, Minn.), or another appropriate program, to 22-30 
nucleotides in length, to have a GC content of 50% or more, and to anneal 
to the target sequence at temperatures about 68-72 °C. The method uses 
several restriction enzymes to generate a suitable fragment. The fragment 
is then circularized by intramolecular ligation and used as a PCR template. 
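
The primer criteria stated above (22-30 nucleotides in length, GC content of 50% or more) can be checked mechanically. The following is a minimal Python sketch and not the algorithm of the OLIGO 4.06 software; the function names `gc_content` and `meets_primer_criteria` and the example sequence are hypothetical. The annealing temperature criterion (about 68-72 °C) is deliberately not modeled, since it depends on the melting-temperature model and reaction conditions chosen.

```python
def gc_content(primer: str) -> float:
    """Return the fraction of G and C bases in a primer (0.0 to 1.0)."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """Check only the length and GC-content criteria given in the text."""
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


# Illustrative sequence only (hypothetical, not a disclosed sequence).
example = "GCGTACCGGATCCAGCTGACGTCA"
print(len(example), round(gc_content(example), 3), meets_primer_criteria(example))
```

A candidate that fails either test would be redesigned before the annealing temperature is evaluated with a dedicated tool.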

Another method which may be used is capture PCR which involves PCR 
amplification of DNA fragments adjacent to a known sequence in human 
and yeast artificial chromosome DNA (Lagerstrom, M. et al. (1991) PCR Methods 
Applic. 1:111-119). In this method, multiple restriction enzyme digestions 
and ligations may also be used to place an engineered double-stranded 
sequence into an unknown portion of the DNA molecule before performing 
PCR. 

Another method which may be used to retrieve unknown sequences is that 
of Parker, J. D. et al. (1991; Nucleic Acids Res. 19:3055-3060). 
Additionally, one may use PCR, nested primers, and PROMOTERFINDER 
libraries to walk in genomic DNA (Clontech, Palo Alto, Calif.). This process 
avoids the need to screen libraries and is useful in finding intron/exon 
junctions. 

When screening for full-length cDNAs, it is preferable to use libraries that 
have been size-selected to include larger cDNAs. Also, random-primed 
libraries are preferable, in that they will contain more sequences that 
contain the 5' regions of genes. Use of a randomly primed library may be 
especially preferable for situations in which an oligo d(T) library does not 
yield a full-length cDNA. Genomic libraries may be useful for extension of 
sequence into the 5' and 3' non-transcribed regulatory regions. Capillary 
electrophoresis systems, which are commercially available, may be used to 
analyze the size or confirm the nucleotide sequence of sequencing or PCR 
products. In particular, capillary sequencing may employ flowable polymers 
for electrophoretic separation, four different fluorescent dyes (one for each 
nucleotide) which are laser activated, and detection of the emitted 
wavelengths by a charge-coupled device camera. Output/light intensity 
may be converted to an electrical signal using appropriate software (e.g. 
GENOTYPER and SEQUENCE NAVIGATOR, Perkin Elmer) and the entire 
process from loading of samples to computer analysis and electronic data 
display may be computer controlled. Capillary electrophoresis is especially 
preferable for the sequencing of small pieces of DNA, which might be 
present in limited amounts in a particular sample. 

In another embodiment of the invention, polynucleotide sequences or 
fragments thereof which encode aralarl, syntaxinIA, or cpo and 
homologous proteins, or fusion proteins or functional equivalents thereof, 
may be used in recombinant DNA molecules to direct expression of the 
proteins in appropriate host cells. Due to the inherent degeneracy of the 
genetic code, other DNA sequences which encode substantially the same 
or a functionally equivalent amino acid sequence may be produced, and 
these sequences may be used to clone and express the proteins. As will be 
understood by those of skill in the art, it may be advantageous to produce 
protein-encoding nucleotide sequences possessing non-naturally occurring 
codons. For example, codons preferred by a particular prokaryotic or 
eukaryotic host can be selected to increase the rate of protein expression 
or to produce a recombinant RNA transcript having desirable properties, 
such as a half-life which is longer than that of a transcript generated from 
the naturally occurring sequence. The nucleotide sequences of the present 
invention can be engineered using methods generally known in the art in 
order to alter protein-encoding sequences for a variety of reasons, 
including but not limited to, alterations which modify the cloning, 
processing, and/or expression of the gene product. DNA shuffling by 
random fragmentation and PCR reassembly of gene fragments and 
synthetic oligonucleotides may be used to engineer the nucleotide 
sequences. For example, site-directed mutagenesis may be used to insert 
new restriction sites, alter glycosylation patterns, change codon 
preference, produce splice variants, or introduce mutations, and so forth. 
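
The degeneracy of the genetic code and the selection of host-preferred codons described above can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions: `CODON_TO_AA` holds only the standard-code entries needed for the example, `PREFERRED_CODONS` is a hypothetical preference table rather than measured codon usage for any host, and the sequence `native` is invented for the example.

```python
# Standard genetic code entries for the few codons used in this example.
CODON_TO_AA = {
    "ATG": "M", "GCT": "A", "GCG": "A", "CTT": "L", "CTG": "L",
    "AAA": "K", "AAG": "K", "TAA": "*", "TGA": "*",
}

# Hypothetical host preference: one synonymous codon chosen per residue.
PREFERRED_CODONS = {"M": "ATG", "A": "GCG", "L": "CTG", "K": "AAA", "*": "TAA"}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, starting at position 0."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str) -> str:
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED_CODONS[aa] for aa in translate(dna))


native = "ATGGCTCTTAAGTGA"   # encodes M-A-L-K followed by a stop codon
recoded = recode(native)     # different nucleotides, same encoded peptide
print(native, recoded, translate(native) == translate(recoded))
```

The two nucleotide sequences differ at several positions yet encode the same amino acid sequence, which is the property relied upon when changing codon preference.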

In another embodiment of the invention, natural, modified, or recombinant 
nucleic acid sequences encoding aralarl, syntaxinIA, or cpo and 
homologous proteins may be ligated to a heterologous sequence to encode 
a fusion protein. For example, to screen libraries, e.g. peptide libraries or 
low-molecular weight compound libraries for inhibitors of aralarl, 
syntaxinIA, or cpo and homologous protein activities, it may be useful to 
encode chimeric proteins that can be recognized by commercially 
available antibodies. A fusion protein may also be engineered to contain a 
cleavage site located between the desired protein-encoding sequence and 
the heterologous protein sequence so that the desired protein may be 
cleaved and purified away from the heterologous moiety. In another 
embodiment, sequences encoding the protein may be synthesized, in 
whole or in part, using chemical methods well known in the art (see 
Caruthers, M. H. et al. (1980) Nucl. Acids Res. Symp. Ser. 7:215-223, 
Horn, T. et al. (1980) Nucl. Acids Res. Symp. Ser. 7:225-232). 
Alternatively, the proteins themselves may be produced using chemical 
methods to synthesize the amino acid sequence of the protein, or a portion 
thereof. For example, peptide synthesis can be performed using various 
solid-phase techniques (Roberge, J. Y. et al. (1995) Science 269:202-204) 
and automated synthesis may be achieved, for example, using the ABI 
431A peptide synthesizer (Perkin Elmer). The newly synthesized peptide 
may be substantially purified by preparative high performance liquid 
chromatography (e.g., Creighton, T. (1983) Proteins, Structures and 
Molecular Principles, WH Freeman and Co., New York, N.Y.). The 
composition of the synthetic peptides may be confirmed by amino acid 
analysis or sequencing (e.g., the Edman degradation procedure; Creighton, 
supra). Additionally, the amino acid sequences of the proteins, or any part 
thereof, may be altered during direct synthesis and/or combined using 
chemical methods with sequences from other proteins, or any part thereof, 
to produce a variant polypeptide. 

In order to express a biologically active protein, the nucleotide sequences 
encoding the proteins or functional equivalents may be inserted into 
appropriate expression vectors, i.e., vectors which contain the 
necessary elements for the transcription and translation of the inserted 
coding sequence. Methods which are well known to those skilled in the 
art may be used to construct expression vectors containing sequences 
encoding the proteins and appropriate transcriptional and translational 
control elements. These methods include in vitro recombinant DNA 
techniques, synthetic techniques, and in vivo genetic recombination. Such 
techniques are described in Sambrook, J. et al. (1989) Molecular Cloning, 
A Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y., and 
Ausubel, F. M. et al. (1989) Current Protocols in Molecular Biology, John 
Wiley & Sons, New York, N.Y. 

A variety of expression vector/host systems may be utilized to contain and 
express sequences encoding the proteins. These include, but are not 
limited to, micro-organisms such as bacteria transformed with recombinant 
bacteriophage, plasmid, or cosmid DNA expression vectors; yeast 
transformed with yeast expression vectors; insect cell systems infected 
with virus expression vectors (e.g., baculovirus); plant cell systems 
transformed with virus expression vectors (e.g., cauliflower mosaic virus, 
CaMV; tobacco mosaic virus, TMV) or with bacterial expression vectors 
(e.g., Ti or pBR322 plasmids); or animal cell systems. The "control 
elements" or "regulatory sequences" are those non-translated regions of 
the vector (enhancers, promoters, 5' and 3' untranslated regions) which 
interact with host cellular proteins to carry out transcription and 
translation. Such elements may vary in their strength and specificity. 
Depending on the vector system and host utilized, any number of suitable 
transcription and translation elements, including constitutive and inducible 
promoters, may be used. For example, when cloning in bacterial systems, 
inducible promoters such as the hybrid lacZ promoter of the BLUESCRIPT 
phagemid (Stratagene, La Jolla, Calif.) or pSPORT1 plasmid (Gibco BRL) and 
the like may be used. The baculovirus polyhedrin promoter may be used in 
insect cells. Promoters and enhancers derived from the genomes of plant 
cells (e.g., heat shock, RUBISCO, and storage protein genes) or from plant 
viruses (e.g., viral promoters and leader sequences) may be cloned into the 
vector. In mammalian cell systems, promoters from mammalian genes or 
from mammalian viruses are preferable. If it is necessary to generate a cell 
line that contains multiple copies of the sequences encoding the protein, 
vectors based on SV40 or EBV may be used with an appropriate selectable 
marker. 




In bacterial systems, a number of expression vectors may be selected 
depending upon the use intended for the protein. For example, when large 
quantities of protein are needed for the induction of antibodies, vectors 
which direct high-level expression of fusion proteins that are readily 
purified may be used. Such vectors include, but are not limited to, the 
multifunctional E. coli cloning and expression vectors such as the 
BLUESCRIPT phagemid (Stratagene), in which the sequences encoding the 
protein may be ligated into the vector in frame with sequences for the 
amino-terminal Met and the subsequent 7 residues of β-galactosidase so 
that a hybrid protein is produced; pIN vectors (Van Heeke, G. and S. M. 
Schuster (1989) J. Biol. Chem. 264:5503-5509); and the like. pGEX 
vectors (Promega, Madison, Wis.) may also be used to express foreign 
polypeptides as fusion proteins with glutathione S-transferase (GST). In 
general, such fusion proteins are soluble and can easily be purified from 
lysed cells by adsorption to glutathione-agarose beads followed by elution 
in the presence of free glutathione. Proteins made in such systems may be 
designed to include heparin, thrombin, or factor Xa protease cleavage sites 
so that the cloned polypeptide of interest can be released from the GST 
moiety at will. In the yeast, Saccharomyces cerevisiae, a number of 
vectors containing constitutive or inducible promoters such as alpha factor, 
alcohol oxidase, and PGH may be used. For reviews, see Ausubel et al., 
(supra) and Grant et al. (1987) Methods Enzymol. 153:516-544. 

In cases where plant expression vectors are used, the expression of 
sequences encoding the proteins may be driven by any of a number of 
promoters. For example, viral promoters such as the 35S and 19S 
promoters of CaMV may be used alone or in combination with the omega 
leader sequence from TMV (Takamatsu, N. (1987) EMBO J. 6:307-311). 
Alternatively, plant promoters such as the small subunit of RUBISCO or 
heat shock promoters may be used (Coruzzi, G. et al. (1984) EMBO J. 
3:1671-1680; Broglie, R. et al. (1984) Science 224:838-843; and Winter, 
J. et al. (1991) Results Probl. Cell Differ. 17:85-105). These constructs 
can be introduced into plant cells by direct DNA transformation or 
pathogen-mediated transfection. Such techniques are described in a 
number of generally available reviews (see, for example, Hobbs, S. or 
Murry, L. E. in McGraw Hill Yearbook of Science and Technology (1992) 
McGraw Hill, New York, N.Y.; pp. 191-196). 

An insect system may also be used to express the proteins. For example, 
in one such system, Autographa californica nuclear polyhedrosis virus 
(AcNPV) is used as a vector to express foreign genes in Spodoptera 
frugiperda cells or in Trichoplusia larvae. The sequences encoding the 
protein may be cloned into a non-essential region of the virus, such as the 
polyhedrin gene, and placed under control of the polyhedrin promoter. 
Successful insertion of the sequences encoding the protein will render the 
polyhedrin gene inactive and produce recombinant virus lacking coat 
protein. The recombinant 
viruses may then be used to infect, for example, S. frugiperda cells or 
Trichoplusia larvae in which aralarl, syntaxinIA, or cpo and homologous 
proteins may be expressed (Engelhard, E. K. et al. (1994) Proc. Natl. Acad. 
Sci. 91:3224-3227). 

In mammalian host cells, a number of viral-based expression systems may 
be utilized. In cases where an adenovirus is used as an expression vector, 
sequences encoding the protein may be ligated into an adenovirus 
transcription/translation complex consisting of the late promoter and 
tripartite leader sequence. Insertion in a non-essential E1 or E3 region of 
the viral genome may be used to obtain viable viruses which are capable of 
expressing the protein in infected host cells (Logan, J. and Shenk, T. 
(1984) Proc. Natl. Acad. Sci. 81:3655-3659). In addition, transcription 
enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be used 
to increase expression in mammalian host cells. 

Specific initiation signals may also be used to achieve more efficient 
translation of sequences encoding the protein. Such signals include the 
ATG initiation codon and adjacent sequences. In cases where sequences 
encoding the protein, its initiation codons, and upstream sequences are 
inserted into the appropriate expression vector, no additional transcriptional 
or translational control signals may be needed. However, in cases where 
only coding sequence, or a portion thereof, is inserted, exogenous 
translational control signals including the ATG initiation codon should be 
provided. Furthermore, the initiation codon should be in the correct reading 
frame to ensure translation of the entire insert. Exogenous translational 
elements and initiation codons may be of various origins, both natural and 
synthetic. The efficiency of expression may be enhanced by the inclusion 
of enhancers which are appropriate for the particular cell system which is 
used, such as those described in the literature (Scharf, D. et al. (1994) 
Results Probl. Cell Differ. 20:125-162). 
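
The requirement that an ATG initiation codon be supplied, and that it lie in the correct reading frame so that the entire insert is translated, can be sketched in Python as follows. This is a simplified check under stated assumptions: it looks only at the insert itself (no vector context or flanking sequences), treats an insert whose length is not a multiple of three as out of frame, and uses hypothetical function names and invented example sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list:
    """Split a sequence into consecutive triplets, dropping a partial tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def add_start_if_missing(coding: str) -> str:
    """Prepend an ATG when only coding sequence without a start is inserted."""
    coding = coding.upper()
    return coding if coding.startswith("ATG") else "ATG" + coding


def translated_fully(insert: str) -> bool:
    """True if reading from a leading ATG reaches the final triplet of the
    insert without meeting an earlier in-frame stop codon."""
    insert = insert.upper()
    triplets = codons(insert)
    if not triplets or triplets[0] != "ATG":
        return False
    if len(insert) % 3 != 0:
        return False  # start codon is not in frame with the end of the insert
    return not any(c in STOP_CODONS for c in triplets[:-1])


partial = "GCTCTGAAATAA"                 # coding sequence lacking a start codon
construct = add_start_if_missing(partial)
print(construct, translated_fully(construct))
```

An insert such as `ATGTAAGCTCTG`, which carries a stop codon directly after the start, would be rejected by the same check.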

In addition, a host cell strain may be chosen for its ability to modulate the 
expression of the inserted sequences or to process the expressed protein 
in the desired fashion. Such modifications of the polypeptide include, but 
are not limited to, acetylation, carboxylation, glycosylation, 
phosphorylation, lipidation, and acylation. Post-translational processing 
which cleaves a "prepro" form of the protein may also be used to facilitate 
correct insertion, folding and/or function. Different host cells such as CHO, 
HeLa, MDCK, HEK293, and WI38, which have specific cellular machinery 
and characteristic mechanisms for such post-translational activities, may 
be chosen to ensure the correct modification and processing of the foreign 
protein. 

For long-term, high-yield production of recombinant proteins, stable 
expression is preferred. For example, cell lines which stably express 
aralarl, syntaxinIA, or cpo and homologous proteins may be generated by 
transformation using expression vectors which may contain viral origins of 
replication and/or endogenous expression elements and a selectable marker 
gene on the same or on a separate vector. Following the introduction of 
the vector, cells may be allowed to grow for 1-2 days in enriched media 
before they are switched to selective media. The purpose of the selectable 
marker is to confer resistance to selection, and its presence allows growth 
and recovery of cells which successfully express the introduced 
sequences. Resistant clones of stably transformed cells may be proliferated 
using tissue culture techniques appropriate to the cell type. Any number of 
selection systems may be used to recover transformed cell lines. These 
include, but are not limited to, the herpes simplex virus thymidine kinase 
(Wigler, M. et al. (1977) Cell 11:223-32) and adenine 
phosphoribosyltransferase (Lowy, I. et al. (1980) Cell 22:817-23) genes, 
which can be employed in tk- or aprt- cells, respectively. Also, 
antimetabolite, antibiotic or herbicide resistance can be used as the basis 
for selection; for example, dhfr, which confers resistance to methotrexate 
(Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. 77:3567-70); npt, which 
confers resistance to the aminoglycosides neomycin and G-418 
(Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14); and als or pat, 
which confer resistance to chlorsulfuron and phosphinothricin, 
respectively (Murry, supra). Additional selectable genes 
have been described, for example, trpB, which allows cells to utilise indole 
in place of tryptophan, or hisD, which allows cells to utilise histidinol in place 
of histidine (Hartman, S. C. and R. C. Mulligan (1988) Proc. Natl. Acad. 
Sci. 85:8047-51). Recently, the use of visible markers has gained 
popularity with such markers as anthocyanins, β-glucuronidase and its 
substrate GUS, and luciferase and its substrate luciferin, being widely used 
not only to identify transformants, but also to quantify the amount of 
transient or stable protein expression attributable to a specific vector 
system (Rhodes, C. A. et al. (1995) Methods Mol. Biol. 55:121-131). 
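
The selection systems listed above can be summarised as a simple lookup structure. The following Python sketch is only an organisational aid; the dictionary and function names are hypothetical, and the agent for pat is entered as the herbicide phosphinothricin, which is an interpretive assumption about the passage.

```python
# Selection agents named in the text, keyed by marker gene.
RESISTANCE_MARKERS = {
    "dhfr": ("methotrexate",),
    "npt": ("neomycin", "G-418"),
    "als": ("chlorsulfuron",),
    "pat": ("phosphinothricin",),  # assumption: the herbicide, not the enzyme
}

# Markers that rely on a deficient host cell line rather than on resistance.
HOST_DEPENDENT_MARKERS = {"tk": "tk- cells", "aprt": "aprt- cells"}


def markers_for(agent: str) -> list:
    """Return the marker genes listed as conferring resistance to `agent`."""
    wanted = agent.lower()
    return sorted(gene for gene, agents in RESISTANCE_MARKERS.items()
                  if wanted in (a.lower() for a in agents))


print(markers_for("G-418"), markers_for("methotrexate"))
```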

Although the presence/absence of marker gene expression suggests that 
the gene of interest is also present, its presence and expression may need 
to be confirmed. For example, if the sequences encoding the protein of 
interest are inserted within a marker gene sequence, recombinant cells 
containing sequences encoding the protein can be identified by the 
absence of marker gene function. Alternatively, a marker gene can be 
placed in tandem with sequences encoding the protein under the control of 
a single promoter. Expression of the marker gene in response to induction 
or selection usually indicates expression of the tandem gene as well. 
Alternatively, host cells which contain and express the nucleic acid 
sequences encoding the protein may be identified by a variety of 
procedures known to those of skill in the art. These procedures include, 
but are not limited to, DNA-DNA, or DNA-RNA hybridization and protein 
bioassay or immunoassay techniques that include membrane, solution, or 
chip based technologies for the detection and/or quantification of nucleic 
acid or protein. 

The presence of polynucleotide sequences encoding aralarl, syntaxinIA, 
or cpo and homologous proteins can be detected by DNA-DNA or 
DNA-RNA hybridization or amplification using probes or portions or 
fragments of polynucleotides encoding aralarl, syntaxinIA, or cpo and 
homologous proteins. Nucleic acid amplification based assays involve the 
use of oligonucleotides or oligomers based on the sequences specific for 
the gene to detect transformants containing DNA or RNA encoding the 
corresponding protein. As used herein "oligonucleotides" or "oligomers" 
refer to a nucleic acid sequence of at least about 10 nucleotides and as 
many as about 60 nucleotides, preferably about 15 to 30 nucleotides, and 
more preferably about 20-25 nucleotides, which can be used as a probe or 
amplimer. 
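
The nested length ranges in this definition can be expressed as a small classification. This is a Python sketch with a hypothetical function name; because the text qualifies every bound with "about", the hard cut-offs used here are an assumption made for illustration.

```python
def oligomer_tier(length: int) -> str:
    """Classify an oligonucleotide length against the ranges in the text."""
    if 20 <= length <= 25:
        return "more preferred"   # about 20-25 nucleotides
    if 15 <= length <= 30:
        return "preferred"        # about 15 to 30 nucleotides
    if 10 <= length <= 60:
        return "within range"     # at least about 10, as many as about 60
    return "outside range"


for n in (8, 12, 18, 22, 45, 75):
    print(n, oligomer_tier(n))
```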

A variety of protocols for detecting and measuring the expression of 
proteins, using either polyclonal or monoclonal antibodies specific for the 
protein are known in the art. Examples include enzyme-linked 
immunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescence 
activated cell sorting (FACS). A two-site, monoclonal-based immunoassay 
utilizing monoclonal antibodies reactive to two non-interfering epitopes on 
the protein is preferred, but a competitive binding assay may be employed. 
These and other assays are described, among other places, in Hampton, R. 
et al. (1990; Serological Methods, a Laboratory Manual, APS Press, St 
Paul, Minn.) and Maddox, D. E. et al. (1983; J. Exp. Med. 
158:1211-1216). 

A wide variety of labels and conjugation techniques are known by those 
skilled in the art and may be used in various nucleic acid and amino acid 
assays. Means for producing labeled hybridization or PCR probes for 
detecting sequences related to polynucleotides encoding aralarl, 
syntaxinIA, or cpo and homologous proteins include oligo-labeling, nick 
translation, end-labeling or PCR amplification using a labeled nucleotide. 

Alternatively, the sequences encoding the protein, or any portions thereof, 
may be cloned into a vector for the production of an mRNA probe. Such 
vectors are known in the art, are commercially available, and may be used 
to synthesize RNA probes in vitro by addition of an appropriate RNA 
polymerase such as T7, T3, or SP6 and labeled nucleotides. These 
procedures may be conducted using a variety of commercially available kits 
(Pharmacia & Upjohn (Kalamazoo, Mich.); Promega (Madison, Wis.); and 
U.S. Biochemical Corp. (Cleveland, Ohio)). 

Suitable reporter molecules or labels which may be used include 
radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic 
agents as well as substrates, co-factors, inhibitors, magnetic particles, and 
the like. 

Host cells transformed with nucleotide sequences encoding the protein 
may be cultured under conditions suitable for the expression and recovery 
of the protein from cell culture. The protein produced by a recombinant cell 
may be secreted or contained intracellularly depending on the sequence 
and/or the vector used. As will be understood by those of skill in the art, 
expression vectors containing polynucleotides which encode the protein 
may be designed to contain signal sequences, which direct secretion of the 
protein through a prokaryotic or eukaryotic cell membrane. Other 
recombinant constructions may be used to join sequences encoding the 
protein to a nucleotide sequence encoding a polypeptide domain which will 
facilitate purification of soluble proteins. Such purification-facilitating 
domains include, but are not limited to, metal chelating peptides such as 
histidine-tryptophan modules that allow purification on immobilized metals, 
protein A domains that allow purification on immobilized immunoglobulin, 
and the domain utilized in the FLAG extension/affinity purification system 
(Immunex Corp., Seattle, Wash.). The inclusion of cleavable linker 
sequences such as those specific for Factor Xa or enterokinase 
(Invitrogen, San Diego, Calif.) between the purification domain and the 
desired protein may be used to facilitate purification. One such expression 
vector provides for expression of a fusion protein containing the desired 
protein and a nucleic acid encoding 6 histidine residues preceding a 
thioredoxin or an enterokinase cleavage site. The histidine residues 
facilitate purification on IMAC (immobilised metal ion affinity 
chromatography, as described in Porath, J. et al. (1992) Prot. Exp. Purif. 
3:263-281), while the enterokinase cleavage site provides a means for 
purifying the desired protein from the fusion protein. A discussion of 
vectors which are suitable for the production of fusion proteins is provided 
in Kroll, D. J. et al. (1993; DNA Cell Biol. 12:441-453). In addition to 
recombinant production, fragments of the proteins may be produced by 
direct peptide synthesis using solid-phase techniques (Merrifield, J. (1963) 
J. Am. Chem. Soc. 85:2149-2154). Protein synthesis may be performed 
using manual techniques or by automation. Automated synthesis may be 
achieved, for example, using the Applied Biosystems 431A peptide synthesizer 
(Perkin Elmer). Various fragments of the proteins may be chemically 
synthesized separately and combined using chemical methods to produce 
the full length molecule. 
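
The fusion construct described above, in which six histidine residues precede an enterokinase cleavage site and the desired protein, can be modelled at the amino acid level. The Python sketch below is illustrative only: the order of the elements, the function names and the peptide `target` are assumptions, and the enterokinase recognition sequence is taken as Asp-Asp-Asp-Asp-Lys with cleavage after the lysine.

```python
from typing import Tuple

HIS_TAG = "HHHHHH"            # six histidine residues, used for metal affinity
ENTEROKINASE_SITE = "DDDDK"   # Asp-Asp-Asp-Asp-Lys; cleaved after the Lys


def build_fusion(protein: str) -> str:
    """Assemble tag, cleavage site and desired protein in that order."""
    return HIS_TAG + ENTEROKINASE_SITE + protein


def cleave(fusion: str) -> Tuple[str, str]:
    """Split at the first enterokinase site: (tag portion, released protein)."""
    idx = fusion.find(ENTEROKINASE_SITE)
    if idx == -1:
        raise ValueError("no enterokinase site found")
    cut = idx + len(ENTEROKINASE_SITE)
    return fusion[:cut], fusion[cut:]


target = "MALKT"  # hypothetical peptide standing in for the desired protein
tag_part, released = cleave(build_fusion(target))
print(tag_part, released)
```

Because cleavage occurs after the final residue of the recognition sequence, the released portion carries no residues from the tag or the linker in this model.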




Diagnostics and Therapeutics 

The data disclosed in this invention show that the nucleic acids and 
proteins of the invention are useful in diagnostic and therapeutic 
applications implicated, for example but not limited to, in metabolic 
disorders such as obesity as well as related disorders such as eating 
disorder, cachexia, diabetes mellitus, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, gallstones, cancer, e.g. 
cancers of the reproductive organs, and sleep apnea. Hence, diagnostic 
and therapeutic uses for the aralarl, syntaxinIA, or cpo nucleic acids and 
proteins of the invention are, for example but not limited to, the following: 
(i) protein therapeutic, (ii) small molecule drug target, (iii) antibody target 
(therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) diagnostic 
and/or prognostic marker, (v) gene therapy (gene delivery/gene ablation), 
(vi) research tools, and (vii) tissue regeneration in vitro and in vivo 
(regeneration for all these tissues and cell types composing these tissues 
and cell types derived from these tissues). 

The nucleic acids and proteins of the invention are useful in various 
diagnostic and therapeutic applications as described 
below. For example, but not limited to, cDNAs encoding the aralarl, 
syntaxinIA, or cpo proteins of the invention and particularly their human 
homologues may be useful in gene therapy, and the aralarl, syntaxinIA, or 
cpo proteins of the invention and particularly their human homologues may 
be useful when administered to a subject in need thereof. By way of 
non-limiting example, the compositions of the present invention will have 
efficacy for treatment of patients suffering from, for example, but not 
limited to, metabolic disorders as described above. 

The novel nucleic acid encoding the aralarl, syntaxinIA, or cpo protein of 
the invention, or homologous proteins, or fragments thereof, may further 
be useful in diagnostic applications, wherein the presence or amount of the 
nucleic acids or the proteins is to be assessed. These materials are further 
useful in the generation of antibodies that bind immunospecifically to the 
novel substances of the invention for use in therapeutic or diagnostic 
methods. 

For example, in one aspect, antibodies which are specific for aralarl, 
syntaxinIA, or cpo and homologous proteins may be used directly as an 
antagonist, or indirectly as a targeting or delivery mechanism for bringing 
a pharmaceutical agent to cells or tissue which express the protein. The 
antibodies may be generated using methods that are well known in the art. 
Such antibodies may include, but are not limited to, polyclonal, 
monoclonal, chimeric, single chain, Fab fragments, and fragments 
produced by a Fab expression library. Neutralising antibodies (i.e., those 
which inhibit dimer formation) are especially preferred for therapeutic use. 

For the production of antibodies, various hosts including goats, rabbits, 
rats, mice, humans, and others, may be immunized by injection with the 
protein or any fragment or oligopeptide thereof which has immunogenic 
properties. Depending on the host species, various adjuvants may be used 
to increase immunological response. Such adjuvants include, but are not 
limited to, Freund's, mineral gels such as aluminium hydroxide, and surface 
active substances such as lysolecithin, pluronic polyols, polyanions, 
peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol. 
Among adjuvants used in humans, BCG (Bacille Calmette-Guerin) and 
Corynebacterium parvum are especially preferable. It is preferred that the 
peptides, fragments, or oligopeptides used to induce antibodies to the 
protein have an amino acid sequence consisting of at least five amino 
acids, and more preferably at least 10 amino acids. It is preferable that 
they are identical to a portion of the amino acid sequence of the natural 
protein, and they may contain the entire amino acid sequence of a small, 
naturally occurring molecule. Short stretches of aralarl, syntaxinIA, or cpo 
and homologous protein amino acids may be fused with those of another 
protein such as keyhole limpet hemocyanin in order to increase the 
immunogenicity. 

Monoclonal antibodies to the proteins may be prepared using any 
technique that provides for the production of antibody molecules by 
continuous cell lines in culture. These include, but are not limited to, the 
hybridoma technique, the human B-cell hybridoma technique, and the 
EBV-hybridoma technique (Köhler, G. et al. (1975) Nature 256:495-497; 
Kozbor, D. et al. (1985) J. Immunol. Methods 81:31-42; Cote, R. J. et al. 
(1983) Proc. Natl. Acad. Sci. 80:2026-2030; Cole, S. P. et al. (1984) Mol. Cell 
Biol. 62:109-120). 

In addition, techniques developed for the production of "chimeric 
antibodies", i.e., the splicing of mouse antibody genes to human antibody 
genes to obtain a molecule with appropriate antigen specificity and 
biological activity, can be used (Morrison, S. L. et al. (1984) Proc. Natl. 
Acad. Sci. 81:6851-6855; Neuberger, M. S. et al. (1984) Nature 
312:604-608; Takeda, S. et al. (1985) Nature 314:452-454). 
Alternatively, techniques described for the production of single chain 
antibodies may be adapted, using methods known in the art, to produce 
single chain antibodies specific for aralarl, syntaxinIA, or cpo and 
homologous proteins. Antibodies with related specificity, but of distinct 
idiotypic composition, may be generated by chain shuffling from random 
combinatorial immunoglobulin libraries (Burton, D. R. (1991) Proc. Natl. 
Acad. Sci. 88:11120-3). Antibodies may also be produced by inducing in 
vivo production in the lymphocyte population or by screening recombinant 
immunoglobulin libraries or panels of highly specific binding reagents as 
disclosed in the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. 
86:3833-3837; Winter, G. et al. (1991) Nature 349:293-299). 

Antibody fragments which contain specific binding sites for the proteins 
may also be generated. For example, such fragments include, but are not 
limited to, the F(ab')2 fragments which can be produced by pepsin 
digestion of the antibody molecule and the Fab fragments which can be 
generated by reducing the disulfide bridges of F(ab')2 fragments. 
Alternatively, Fab expression libraries may be constructed to allow rapid 
and easy identification of monoclonal Fab fragments with the desired 
specificity (Huse, W. D. et al. (1989) Science 246:1275-1281). 

Various immunoassays may be used for screening to identify antibodies 
having the desired specificity. Numerous protocols for competitive binding 
and immunoradiometric assays using either polyclonal or monoclonal 
antibodies with established specificities are well known in the art. Such 
immunoassays typically involve the measurement of complex formation 
between the protein and its specific antibody. A two-site,
monoclonal-based immunoassay utilising monoclonal antibodies reactive to
two non-interfering protein epitopes is preferred, but a competitive
binding assay may also be employed (Maddox, supra).

In another embodiment of the invention, the polynucleotides encoding 
aralarl, syntaxinIA, or cpo and homologous proteins, or any fragment 
thereof, or antisense molecules, may be used for therapeutic purposes. In 
one aspect, antisense molecules may be used in situations in which it 
would be desirable to block the transcription of the mRNA. In particular, 
cells may be transformed with sequences complementary to 
polynucleotides encoding aralarl, syntaxinIA, or cpo and homologous 
proteins. Thus, antisense molecules may be used to modulate protein 
activity, or to achieve regulation of gene function. Such technology is now
well known in the art, and sense or antisense oligomers or larger fragments
can be designed from various locations along the coding or control regions 
of sequences encoding the proteins. Expression vectors derived from 
retroviruses, adenovirus, herpes or vaccinia viruses, or from various 
bacterial plasmids may be used for delivery of nucleotide sequences to the 
targeted organ, tissue or cell population. Methods, which are well known 
to those skilled in the art, can be used to construct recombinant vectors,
which will express antisense molecules complementary to the 
polynucleotides of the genes encoding aralarl, syntaxinIA, or cpo and 
homologous proteins. These techniques are described both in Sambrook et 
al. (supra) and in Ausubel et al. (supra). Genes encoding aralarl, 
syntaxinIA, or cpo and homologous proteins can be turned off by 
transforming a cell or tissue with expression vectors which express high 
levels of polynucleotide which encodes aralarl, syntaxinIA, or cpo and 
homologous proteins or fragments thereof. Such constructs may be used 
to introduce untranslatable sense or antisense sequences into a cell. Even 
in the absence of integration into the DNA, such vectors may continue to 
transcribe RNA molecules until they are disabled by endogenous nucleases. 
Transient expression may last for a month or more with a non-replicating 
vector and even longer if appropriate replication elements are part of the 
vector system. 

As mentioned above, modifications of gene expression can be obtained by 
designing antisense molecules, e.g. DNA, RNA, or PNA, to the control 
regions of the genes encoding aralarl, syntaxinIA, or cpo and homologous
proteins, i.e., the promoters, enhancers, and introns. Oligonucleotides 
derived from the transcription initiation site, e.g., between positions -10 
and +10 from the start site, are preferred. Similarly, inhibition can be 
achieved using "triple helix" base-pairing methodology. Triple helix pairing 
is useful because it causes inhibition of the ability of the double helix to
open sufficiently for the binding of polymerases, transcription factors, or
regulatory molecules. Recent therapeutic advances using triplex DNA have
been described in the literature (Gee, J. E. et al. (1994) In: Huber, B. E.
and B. I. Carr, Molecular and Immunologic Approaches, Futura Publishing 
Co., Mt. Kisco, N.Y.). The antisense molecules may also be designed to 
block translation of mRNA by preventing the transcript from binding to 
ribosomes. 
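
As an illustration only, the selection of an antisense oligonucleotide
spanning positions -10 to +10 around the transcription start site can be
sketched in Python. The function name antisense_oligo, its parameters, and
the convention that position +1 is the start site with no position 0 are
assumptions made for this sketch and are not part of the disclosure.

```python
# Hypothetical helper, not taken from the specification.
# Convention assumed here: +1 is the transcription start site and there is
# no position 0, so -10..+10 covers 20 nucleotides.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sequence, tss_index, upstream=10, downstream=10):
    """Return the antisense DNA oligomer for the window around the start site.

    sequence   -- sense-strand DNA written 5'->3'
    tss_index  -- 0-based index of the +1 nucleotide within `sequence`
    """
    start = tss_index - upstream
    end = tss_index + downstream
    if start < 0 or end > len(sequence):
        raise ValueError("window extends beyond the supplied sequence")
    window = sequence[start:end].upper()
    # Complement each base, then reverse so the result also reads 5'->3'.
    return window.translate(COMPLEMENT)[::-1]
```

The returned oligomer is the reverse complement of the sense-strand
window; characters other than A, C, G, and T pass through unchanged in
this sketch.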

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the
specific cleavage of RNA. The mechanism of ribozyme action involves
sequence-specific hybridization of the ribozyme molecule to complementary
target RNA, followed by endonucleolytic cleavage. Examples, which may
be used, include engineered hammerhead motif ribozyme molecules that
can specifically and efficiently catalyze endonucleolytic cleavage of
sequences encoding aralarl, syntaxinIA, or cpo and homologous proteins. 
Specific ribozyme cleavage sites within any potential RNA target are 
initially identified by scanning the target molecule for ribozyme cleavage 
sites which include the following sequences: GUA, GUU, and GUC. Once 
identified, short RNA sequences of between 15 and 20 ribonucleotides 
corresponding to the region of the target gene containing the cleavage site 
may be evaluated for secondary structural features which may render the 
oligonucleotide inoperable. The suitability of candidate targets may also be 
evaluated by testing accessibility to hybridization with complementary 
oligonucleotides using ribonuclease protection assays. 
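
The initial scanning step described above can be sketched in Python as
follows. The function names and the default window length of 17 are
assumptions chosen for illustration; the sketch only locates GUA, GUU,
and GUC triplets and extracts surrounding windows of 15 to 20
ribonucleotides. It does not evaluate secondary structure or
accessibility, which would still have to be assessed separately.

```python
import re


def to_rna(sequence):
    """Upper-case the input and write thymine as uracil."""
    return sequence.upper().replace("T", "U")


def find_cleavage_sites(sequence):
    """Return 0-based start indices of GUA, GUU, and GUC triplets."""
    return [m.start() for m in re.finditer(r"GU[AUC]", to_rna(sequence))]


def candidate_windows(sequence, length=17):
    """Return (site_index, window) pairs, one per cleavage triplet.

    Each window holds `length` ribonucleotides (15 to 20) with the triplet
    near its centre; near the ends of the target the window is shifted
    inward so that it keeps its full length.
    """
    if not 15 <= length <= 20:
        raise ValueError("length must be between 15 and 20")
    rna = to_rna(sequence)
    if len(rna) < length:
        raise ValueError("target is shorter than the requested window")
    flank = (length - 3) // 2
    windows = []
    for site in find_cleavage_sites(rna):
        start = min(max(0, site - flank), len(rna) - length)
        windows.append((site, rna[start:start + length]))
    return windows
```

Each returned window is a candidate region to be checked for secondary
structural features before a ribozyme is designed against it.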

Antisense molecules and ribozymes of the invention may be prepared by 
any method known in the art for the synthesis of nucleic acid molecules. 
These include techniques for chemically synthesizing oligonucleotides such 
as solid phase phosphoramidite chemical synthesis. Alternatively, RNA 
molecules may be generated by in vitro and in vivo transcription of DNA 
sequences encoding aralarl, syntaxinIA, or cpo and homologous proteins. 
Such DNA sequences may be incorporated into a variety of vectors with 
suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these 
cDNA constructs that synthesize antisense RNA constitutively or inducibly 
can be introduced into cell lines, cells, or tissues. RNA molecules may be 
modified to increase intracellular stability and half-life. Possible 
modifications include, but are not limited to, the addition of flanking 
sequences at the 5' and/or 3' ends of the molecule or the use of 
phosphorothioate or 2' O-methyl rather than phosphodiester linkages
within the backbone of the molecule. This concept is inherent in the
production of PNAs and can be extended in all of these molecules by the
inclusion of non-traditional bases such as inosine, queuosine, and
wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms 
of adenine, cytidine, guanine, thymine, and uridine which are not as easily 
recognized by endogenous endonucleases. 

Many methods for introducing vectors into cells or tissues are available and 
equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, 
vectors may be introduced into stem cells taken from the patient and 
clonally propagated for autologous transplant back into that same patient. 
Delivery by transfection and by liposome injections may be achieved using 
methods, which are well known in the art. Any of the therapeutic methods 
described above may be applied to any suitable subject including, for 
example, mammals such as dogs, cats, cows, horses, rabbits, monkeys, 
and most preferably, humans. 

An additional embodiment of the invention relates to the administration of 
a pharmaceutical composition, in conjunction with a pharmaceutically
acceptable carrier, for any of the therapeutic effects discussed above.
Such pharmaceutical compositions may consist of aralarl, syntaxinIA, or
cpo and homologous nucleic acids or proteins, antibodies to aralarl,
syntaxinIA, or cpo and homologous proteins, mimetics, agonists,
antagonists, or inhibitors of aralarl, syntaxinIA, or cpo and homologous
proteins or nucleic acids. The compositions may be administered alone or
in combination with at least one other agent, such as a stabilizing
compound, which may be administered in any sterile, biocompatible
pharmaceutical carrier, including, but not limited to, saline, buffered saline, 
dextrose, and water. The compositions may be administered to a patient 
alone, or in combination with other agents, drugs or hormones. The 
pharmaceutical compositions utilized in this invention may be administered 
by any number of routes including, but not limited to, oral, intravenous, 
intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, 
transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical,
sublingual, or rectal means. 

In addition to the active ingredients, these pharmaceutical compositions
may contain suitable pharmaceutically-acceptable carriers comprising
excipients and auxiliaries, which facilitate processing of the active
compounds into preparations which can be used pharmaceutically. Further
details on techniques for formulation and administration may be found in
the latest edition of Remington's Pharmaceutical Sciences (Mack
Publishing Co., Easton, Pa.). Pharmaceutical compositions for oral
administration can be formulated using pharmaceutically acceptable carriers
well known in the art in dosages suitable for oral administration. Such
carriers enable the pharmaceutical compositions to be formulated as
tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions,
and the like, for ingestion by the patient.

Pharmaceutical preparations for oral use can be obtained through
combination of active compounds with solid excipient, optionally grinding
a resulting mixture, and processing the mixture of granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable
excipients are carbohydrate or protein fillers, such as sugars, including
lactose, sucrose, mannitol, or sorbitol; starch from corn, wheat, rice,
potato, or other plants; cellulose, such as methyl cellulose,
hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; gums
including Arabic and tragacanth; and proteins such as gelatine and
collagen. If desired, disintegrating or solubilising agents may be added,
such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid, or a salt
thereof, such as sodium alginate. Dragee cores may be used in conjunction
with suitable coatings, such as concentrated sugar solutions, which may
also contain gum Arabic, talc, polyvinylpyrrolidone, carbopol gel,
polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable
organic solvents or solvent mixtures. Dyestuffs or pigments may be added
to the tablets or dragee coating for product identification or to characterize
the quantity of active compound, i.e., dosage. Pharmaceutical 
preparations, which can be used orally, include push-fit capsules made of 
gelatine, as well as soft, sealed capsules made of gelatine and a coating, 
such as glycerol or sorbitol. Push-fit capsules can contain active 
ingredients mixed with a filler or binders, such as lactose or starches, 
lubricants, such as talc or magnesium stearate, and, optionally, stabilizers.
In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid, or liquid polyethylene glycol with 
or without stabilizers. 

Pharmaceutical formulations suitable for parenteral administration may be 
formulated in aqueous solutions, preferably in physiologically compatible 
buffers such as Hanks' solution, Ringer's solution, or physiologically
buffered saline. Aqueous injection suspensions may contain substances,
which increase the viscosity of the suspension, such as sodium
carboxymethyl cellulose, sorbitol, or dextran. Additionally, suspensions of
the active compounds may be prepared as appropriate oily injection 
suspensions. Suitable lipophilic solvents or vehicles include fatty oils such 
as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Optionally, the suspension may also contain 
suitable stabilizers or agents which increase the solubility of the compounds
to allow for the preparation of highly concentrated solutions. 

For topical or nasal administration, penetrants appropriate to the particular 
barrier to be permeated are used in the formulation. Such penetrants are 
generally known in the art. 

The pharmaceutical compositions of the present invention may be 
manufactured in a manner that is known in the art, e.g., by means of
conventional mixing, dissolving, granulating, dragee-making, levigating,
emulsifying, encapsulating, entrapping, or lyophilizing processes. The
pharmaceutical composition may be provided as a salt and can be formed
with many acids, including but not limited to, hydrochloric, sulphuric, 
acetic, lactic, tartaric, malic, succinic, etc. Salts tend to be more soluble in 
aqueous or other protonic solvents than are the corresponding free base 
forms. In other cases, the preferred preparation may be a lyophilized 
powder which may contain any or all of the following: 1-50 mM histidine, 
0.1%-2% sucrose, and 2-7% mannitol, at a pH range of 4.5 to 5.5, that is
combined with buffer prior to use. After pharmaceutical compositions have 
been prepared, they can be placed in an appropriate container and labeled 
for treatment of an indicated condition. For administration of proteins, such 
labeling would include amount, frequency, and method of administration. 

Pharmaceutical compositions suitable for use in the invention include 
compositions wherein the active ingredients are contained in an effective 
amount to achieve the intended purpose. The determination of an effective 
dose is well within the capability of those skilled in the art. For any 
compounds, the therapeutically effective dose can be estimated initially
either in cell culture assays, e.g., of preadipocyte cell lines, or in animal 
models, usually mice, rabbits, dogs, or pigs. The animal model may also be 
used to determine the appropriate concentration range and route of 
administration. Such information can then be used to determine useful 
doses and routes for administration in humans. A therapeutically effective 
dose refers to that amount of active ingredient, for example aralarl, 
syntaxinIA, or cpo and homologous proteins or nucleic acids or fragments
thereof, antibodies to aralarl, syntaxinIA, or cpo and homologous
proteins, which is sufficient for treating a specific condition. Therapeutic
efficacy and toxicity may be determined by standard pharmaceutical 
procedures in cell cultures or experimental animals, e.g., ED50 (the dose 
therapeutically effective in 50% of the population) and LD50 (the dose 
lethal to 50% of the population). The dose ratio between therapeutic and 
toxic effects is the therapeutic index, and it can be expressed as the ratio, 
LD50/ED50. Pharmaceutical compositions, which exhibit large therapeutic 
indices, are preferred. The data obtained from cell culture assays and
animal studies is used in formulating a range of dosage for human use. The
dosage contained in such compositions is preferably within a range of
circulating concentrations that include the ED50 with little or no toxicity.
The dosage varies within this range depending upon the dosage form
employed, sensitivity of the patient, and the route of administration. The
exact dosage will be determined by the practitioner, in light of factors 
related to the subject that requires treatment. Dosage and administration 
are adjusted to provide sufficient levels of the active moiety or to maintain 
the desired effect. Factors, which may be taken into account, include the 
severity of the disease state, general health of the subject, age, weight, 
and gender of the subject, diet, time and frequency of administration, drug 
combination(s), reaction sensitivities, and tolerance/response to therapy. 
Long-acting pharmaceutical compositions may be administered every 3 to 
4 days, every week, or once every two weeks depending on half-life and 
clearance rate of the particular formulation. Normal dosage amounts may 
vary from 0.1 to 100,000 micrograms, up to a total dose of about 1 g, 
depending upon the route of administration. Guidance as to particular 
dosages and methods of delivery is provided in the literature and generally 
available to practitioners in the art. Those skilled in the art employ different 
formulations for nucleotides than for proteins or their inhibitors. Similarly, 
delivery of polynucleotides or polypeptides will be specific to particular 
cells, conditions, locations, etc. 
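
The therapeutic index defined above as the ratio LD50/ED50 can be
illustrated with a short Python sketch. The function names and all dose
values are hypothetical and serve only to show the arithmetic; they do
not represent measured data for any composition of the invention.

```python
def therapeutic_index(ld50, ed50):
    """Return LD50/ED50; both doses must be positive and in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


def rank_by_therapeutic_index(compounds):
    """Sort (name, ld50, ed50) tuples from largest to smallest index."""
    return sorted(
        compounds,
        key=lambda c: therapeutic_index(c[1], c[2]),
        reverse=True,
    )


# Hypothetical dose values in mg/kg, for illustration only.
example = [("compound A", 500.0, 25.0), ("compound B", 300.0, 5.0)]
ranked = rank_by_therapeutic_index(example)
```

In this made-up example, compound B has an index of 60 and compound A an
index of 20, so compound B would be ranked first under the stated
preference for large therapeutic indices.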

In another embodiment, antibodies which specifically bind to the proteins 
may be used for the diagnosis of conditions or diseases characterized by or 
associated with over- or underexpression of aralarl, syntaxinIA, or cpo 
and homologous proteins, or in assays to monitor patients being treated 
with aralarl, syntaxinIA, or cpo and homologous proteins, agonists, 
antagonists or inhibitors. The antibodies useful for diagnostic purposes may 
be prepared in the same manner as those described above for therapeutics. 
Diagnostic assays include methods which utilize the antibody and a label to 
detect the protein in human body fluids or extracts of cells or tissues. The
antibodies may be used with or without modification, and may be labeled
by joining them, either covalently or non-covalently, with a reporter
molecule. A wide variety of reporter molecules which are known in the art
may be used, several of which are described above.

A variety of protocols including ELISA, RIA, and FACS for measuring 
proteins are known in the art and provide a basis for diagnosing altered or 
abnormal levels of gene expression. Normal or standard values for gene 
expression are established by combining body fluids or cell extracts taken 
from normal mammalian subjects, preferably human, with antibodies to the 
protein under conditions suitable for complex formation. The amount of 
standard complex formation may be quantified by various methods, but 
preferably by photometric means. Quantities of protein expressed in
control and disease samples from biopsied tissues are compared with the
standard values. Deviation between standard and subject values 
establishes the parameters for diagnosing disease. 

In another embodiment of the invention, the polynucleotides specific for 
aralarl, syntaxinIA, or cpo and homologous proteins may be used for 
diagnostic purposes. The polynucleotides, which may be used, include 
oligonucleotide sequences, antisense RNA and DNA molecules, and PNAs. 
The polynucleotides may be used to detect and quantitate gene expression 
in biopsied tissues in which gene expression may be correlated with 
disease. The diagnostic assay may be used to distinguish between 
absence, presence, and excess gene expression, and to monitor regulation 
of protein levels during therapeutic intervention. 

In one aspect, hybridization with PCR probes which are capable of 
detecting polynucleotide sequences, including genomic sequences, 
encoding aralarl, syntaxinIA, or cpo and homologous proteins or closely 
related molecules, may be used to identify nucleic acid sequences which 
encode the respective protein. The specificity of the probe, whether it is
made from a highly specific region, e.g., unique nucleotides in the 5' 
regulatory region, or a less specific region, e.g., especially in the 3' coding 
region, and the stringency of the hybridization or amplification (maximal, 
high, intermediate, or low) will determine whether the probe identifies only 
naturally occurring sequences, alleles, or related sequences. Probes may 
also be used for the detection of related sequences, and should preferably 
contain at least 50% of the nucleotides from any of the aralarl,
syntaxinIA, or cpo and homologous protein-encoding sequences. The 
hybridization probes of the subject invention may be DNA or RNA and 
derived from the nucleotide sequence of the polynucleotide comprising 
aralarl (GadFly Accession Number CG2139), a human solute carrier family
25 (mitochondrial carrier, Aralar), member 12 (GenBank Accession Number
XP_010876.3 for the protein, XM_010876 for the cDNA), a human solute
carrier family 25, member 13 (citrin) (GenBank Accession Number
NP_055066.1 for the protein, NM_014251 for the cDNA), syntaxinIA
(GadFly Accession Number CG18615), a human syntaxin 1B2 (GenBank
Accession Number NP_443106.1 for the protein, NM_052874 for the
cDNA), a human syntaxin 1B (GenBank Accession Number NP_003154.1
for the protein, NM_003163 for the cDNA), cpo (GadFly Accession
Number CG18434), SEQ ID NO:1 (GadFly Accession Number CG31243),
a human RNA-binding protein gene with multiple splicing (RBPMS; 
GenBank Accession Number XP_047075.1 for the protein, XM_047075 for 
the cDNA), or a human gene similar to RNA-binding protein with multiple 
splicing (RBP-MS; GenBank Accession Number XP_091097 for the protein, 
XM_091097 for the cDNA) or from a genomic sequence including 
promoter, enhancer elements, and introns of the naturally occurring gene. 
Means for producing specific hybridization probes for DNAs encoding 
aralarl, syntaxinIA, or cpo and homologous proteins include the cloning of
nucleic acid sequences specific for aralarl, syntaxinIA, or cpo and 
homologous proteins into vectors for the production of mRNA probes. 
Such vectors are known in the art, commercially available, and may be 
used to synthesize RNA probes in vitro by means of the addition of the
appropriate RNA polymerases and the appropriate labeled nucleotides.
Hybridization probes may be labeled by a variety of reporter groups, for
example, radionuclides such as 32P or 35S, or enzymatic labels, such as
alkaline phosphatase coupled to the probe via avidin/biotin coupling 
systems, and the like. 

Polynucleotide sequences specific for aralarl, syntaxinIA, or cpo and 
homologous nucleic acids may be used for the diagnosis of conditions or 
diseases, which are associated with the expression of the proteins. 
Examples of such conditions or diseases include, but are not limited to, 
pancreatic diseases and disorders, including diabetes. Polynucleotide 
sequences specific for aralarl, syntaxinIA, or cpo and homologous 
proteins may also be used to monitor the progress of patients receiving 
treatment for pancreatic diseases and disorders, including diabetes. The 
polynucleotide sequences may be used in Southern or Northern analysis, 
dot blot, or other membrane-based technologies; in PCR technologies; or in 
dip stick, pin, ELISA or chip assays utilizing fluids or tissues from patient 
biopsies to detect altered gene expression. Such qualitative or quantitative 
methods are well known in the art. 

In a particular aspect, the nucleotide sequences specific for aralarl, 
syntaxinIA, or cpo and homologous nucleic acids may be useful in assays 
that detect activation or induction of various metabolic diseases such as 
obesity as well as related disorders such as eating disorder, cachexia, 
diabetes mellitus, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, gallstones, cancers of 
the reproductive organs, and sleep apnea. The nucleotide sequences may 
be labeled by standard methods, and added to a fluid or tissue sample from 
a patient under conditions suitable for the formation of hybridization 
complexes. After a suitable incubation period, the sample is washed and 
the signal is quantitated and compared with a standard value. If the 
amount of signal in the biopsied or extracted sample is significantly altered
from that of a comparable control sample, the nucleotide sequences have
hybridized with nucleotide sequences in
the sample, and the presence of altered levels of nucleotide sequences 
encoding aralarl, syntaxinIA, or cpo and homologous proteins in the 
sample indicates the presence of the associated disease. Such assays may 
also be used to evaluate the efficacy of a particular therapeutic treatment 
regimen in animal studies, in clinical trials, or in monitoring the treatment of 
an individual patient. 

In order to provide a basis for the diagnosis of a disease associated with 
expression of aralarl, syntaxinIA, or cpo and homologous proteins, a 
normal or standard profile for expression is established. This may be 
accomplished by combining body fluids or cell extracts taken from normal 
subjects, either animal or human, with a sequence, or a fragment thereof, 
which is specific for aralarl, syntaxinIA, or cpo and homologous nucleic 
acids, under conditions suitable for hybridization or amplification. Standard 
hybridization may be quantified by comparing the values obtained from 
normal subjects with those from an experiment where a known amount of 
a substantially purified polynucleotide is used. Standard values obtained 
from normal samples may be compared with values obtained from samples 
from patients who are symptomatic for disease. Deviation between 
standard and subject values is used to establish the presence of disease. 
Once disease is established and a treatment protocol is initiated, 
hybridization assays may be repeated on a regular basis to evaluate 
whether the level of expression in the patient begins to approximate that
which is observed in the normal patient. The results obtained from
successive assays may be used to show the efficacy of treatment over a 
period ranging from several days to months. 

With respect to metabolic diseases such as obesity as well as related 
disorders such as eating disorder, cachexia, diabetes mellitus, 
hypertension, coronary heart disease, hypercholesterolemia, dyslipidemia, 
osteoarthritis, gallstones, cancers of the reproductive organs, and sleep
apnea, the presence of a relatively high amount of transcript in biopsied
tissue from an individual may indicate a predisposition for the development 
of the disease, or may provide a means for detecting the disease prior to 
the appearance of actual clinical symptoms. A more definitive diagnosis of 
this type may allow health professionals to employ preventative measures 
or aggressive treatment earlier thereby preventing the development or 
further progression of the pancreatic diseases and disorders. Additional 
diagnostic uses for oligonucleotides designed from the sequences encoding 
aralarl, syntaxinIA, or cpo and homologous proteins may involve the use 
of PCR. Such oligomers may be chemically synthesized, generated 
enzymatically, or produced from a recombinant source. Oligomers will
preferably consist of two nucleotide sequences, one with sense orientation
(5'→3') and another with antisense (3'←5'), employed under
optimized conditions for identification of a specific gene or condition. The 
same two oligomers, nested sets of oligomers, or even a degenerate pool 
of oligomers may be employed under less stringent conditions for detection 
and/or quantification of closely related DNA or RNA sequences. 

Methods which may also be used to quantitate the expression of aralarl, 
syntaxinIA, or cpo include radiolabeling or biotinylating nucleotides, 
coamplification of a control nucleic acid, and standard curves onto which 
the experimental results are interpolated (Melby, P. C. et al. (1993) J.
Immunol. Methods, 159:235-244; Duplaa, C. et al. (1993) Anal. Biochem.
212:229-236). The speed of quantification of multiple samples may be
accelerated by running the assay in an ELISA format where the oligomer of
interest is presented in various dilutions and a spectrophotometric or
colorimetric response gives rapid quantification. 
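
Interpolating an experimental result onto a standard curve, as mentioned
above, can be sketched in Python. The function name, the use of
piecewise-linear interpolation, and the numerical values are assumptions
for illustration only; the cited references may fit standard curves
differently.

```python
def interpolate_standard_curve(standards, signal):
    """Estimate the amount of nucleic acid that corresponds to `signal`.

    standards -- list of (known_amount, measured_signal) pairs whose signal
                 values are all distinct
    signal    -- measured signal of the experimental sample

    Linear interpolation is applied between the two neighbouring standards.
    """
    points = sorted(standards, key=lambda p: p[1])
    if len(points) < 2:
        raise ValueError("at least two standards are required")
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 == s1:
            raise ValueError("standards must have distinct signal values")
        if s0 <= signal <= s1:
            return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical dilution series: (amount in ng, absorbance reading).
standards = [(0.0, 0.0), (10.0, 0.5), (20.0, 1.0)]
estimate = interpolate_standard_curve(standards, 0.75)
```

Signals outside the range covered by the standards are rejected rather
than extrapolated, since the curve is only characterised between the
lowest and highest dilution.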

In another embodiment of the invention, the nucleic acid sequences which 
are specific for aralarl, syntaxinIA, or cpo and homologous nucleic acids
may also be used to generate hybridization probes, which are useful for
mapping the naturally occurring genomic sequence. The sequences may be
mapped to a particular chromosome or to a specific region of the 
chromosome using well known techniques. Such techniques include FISH, 
FACS, or artificial chromosome constructions, such as yeast artificial 
chromosomes, bacterial artificial chromosomes, bacterial P1 constructions 
or single chromosome cDNA libraries as reviewed in Price, C. M. (1993) 
Blood Rev. 7:127-134, and Trask, B. J. (1991) Trends Genet. 7:149-154. 
FISH (as described in Verma et al. (1988) Human Chromosomes: A Manual 
of Basic Techniques, Pergamon Press, New York, N.Y.) may be correlated 
with other physical chromosome mapping techniques and genetic map 
data. Examples of genetic map data can be found in the 1994 Genome 
Issue of Science (265:1981f). Correlation between the location of the gene
encoding aralarl, syntaxinIA, or cpo on a physical chromosomal map and 
a specific disease, or predisposition to a specific disease, may help to 
delimit the region of DNA associated with that genetic disease. 

The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier, or affected 
individuals. In situ hybridization of chromosomal preparations and physical 
mapping techniques such as linkage analysis using established 
chromosomal markers may be used for extending genetic maps. Often the 
placement of a gene on the chromosome of another mammalian species, 
such as mouse, may reveal associated markers even if the number or arm 
of a particular human chromosome is not known. New sequences can be 
assigned to chromosomal arms, or parts thereof, by physical mapping. This 
provides valuable information to investigators searching for disease genes 
using positional cloning or other gene discovery techniques. Once the 
disease or syndrome has been crudely localized by genetic linkage to a 
particular genomic region, for example, AT to 11q22-23 (Gatti, R. A. et al.
(1988) Nature 336:577-580), any sequences mapping to that area may 
represent associated or regulatory genes for further investigation. The 
nucleotide sequences of the subject invention may also be used to detect 
differences in the chromosomal location due to translocation, inversion,
etc. among normal, carrier, or affected individuals. 

In another embodiment of the invention, aralarl, syntaxinIA, or cpo and 
homologous proteins, their catalytic or immunogenic fragments or 
oligopeptides thereof, can be used for screening libraries of compounds, 
e.g. peptides or low-molecular weight organic compounds, in any of a 
variety of drug screening techniques. The fragment employed in such 
screening may be free in solution, affixed to a solid support, borne on a cell 
surface, or located intracellularly. The formation of binding complexes,
between aralarl, syntaxinIA, or cpo and homologous proteins and the 
agent tested, may be measured. 

Another technique for drug screening, which may be used, provides for 
high throughput screening of compounds having suitable binding affinity to 
the protein of interest as described in published PCT application 
WO84/03564. In this method, as applied to aralarl, syntaxinIA, or cpo
and homologous proteins, large numbers of different small test compounds,
e.g. peptides or low-molecular weight organic compounds, are synthesized
on a solid substrate, such as plastic pins or some other surface. The test 
compounds are reacted with the proteins, or fragments thereof, and 
washed. Bound proteins are then detected by methods well known in the 
art. Purified proteins can also be coated directly onto plates for use in the 
aforementioned drug screening techniques. Alternatively, non-neutralizing 
antibodies can be used to capture the peptide and immobilize it on a solid 
support. In another embodiment, one may use competitive drug screening 
assays in which neutralizing antibodies capable of binding the protein 
specifically compete with a test compound for binding the protein. In this 
manner, the antibodies can be used to detect the presence of any peptide, 
which shares one or more antigenic determinants with the protein. In 
additional embodiments, the nucleotide sequences which are specific for 
aralarl, syntaxinIA, or cpo and homologous nucleic acids or proteins 
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encoded thereby may be used in any molecular biology techniques that 
have yet to be developed, provided the new techniques rely on properties 
of nucleotide that are currently known, including, but not limited to, such 
properties as the triplet genetic code and specific base pair interactions. 

The Figures show: 

FIGURE 1 shows the decrease of triglyceride content of EP(3)3675 flies
('EP(3)3675', column 2) caused by homozygous viable integration of the
P-vector into an intron of the CG2139 gene (in comparison to controls
without integration of this vector, 'EP-control', column 1).

FIGURE 2 shows the molecular organization of the mutated aralar1 (GadFly
Accession Number CG2139) gene locus.

FIGURE 3 shows the BLASTP search results for the CG2139 gene product
(Query) with the two best human homologous matches (Sbjct).
FIGURE 3A shows the homology to human protein XP_010876.3.
FIGURE 3B shows the homology to human protein NP_055066.1.

FIGURE 4 shows the decrease of triglyceride content of EP(3)3215 flies
('EP(3)3215', column 2) caused by homozygous viable integration of
the P-vector into an EST-clone (LD43943) that overlaps with CG18615 (in
comparison to controls without integration of this vector, 'EP-control',
column 1).

FIGURE 5 shows the molecular organization of the mutated syntaxin1A
(GadFly Accession Number CG18615) gene locus.

FIGURE 6 shows the BLASTP search result for CG18615 (Query) with the
two best human homologous matches (Sbjct).
FIGURE 6A shows the homology to human protein NP_443106.1.
FIGURE 6B shows the homology to human protein NP_003154.1.

FIGURE 7 shows the increase of triglyceride content of EP(3)0661 flies
('EP(3)0661/TM3,Sb', column 2) caused by heterozygous lethal
integration of the P-vector into the promoter of CG18434 (in comparison to
controls without integration of this vector, 'EP-control', column 1).

FIGURE 8 shows the molecular organization of the mutated cpo (GadFly
Accession Number CG18434) gene locus.

FIGURE 9A shows the Clustal X multiple sequence alignment for CG31243
('CG31243-PA') with the two best human homologous matches
('XP_047075' and 'XP_091097').

FIGURE 9B shows the amino acid sequence encoded by Drosophila gene
CG31243 (GadFly Accession Number), SEQ ID NO:1.

The examples illustrate the invention: 

Example 1: Measurement of triglyceride content 

Mutant flies were obtained from a fly mutation stock collection. The flies
were grown under standard conditions known to those skilled in the art. In
the course of the experiment, additional feedings with baker's yeast
(Saccharomyces cerevisiae) were provided. The average increase or decrease
of triglyceride content of Drosophila containing the EP-vectors as
homozygous viable integration was investigated in comparison to control
flies (see FIGURES 1, 4, and 7). For determination of triglyceride, flies were
incubated for 5 min at 90°C in an aqueous buffer using a waterbath,
followed by hot extraction. After another 5 min incubation at 90°C and
mild centrifugation, the triglyceride content of the fly extract was
determined using the Sigma Triglyceride (INT 336-10 or -20) assay by
measuring changes in the optical density according to the manufacturer's
protocol. As a reference, the protein content of the same extract was
measured using the BIO-RAD DC Protein Assay according to the
manufacturer's protocol. The assay was repeated three times. The average
triglyceride level of all flies of the EP collection (referred to as
'EP-control') is shown as 100% in the first column in FIGURES 1, 4, and 7,
respectively. Standard deviations of the measurements are shown as thin
bars.
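
The normalization described above (triglyceride signal referenced to the
protein content of the same extract, averaged over three repeats, and
expressed relative to an 'EP-control' level set to 100%) can be sketched as
follows. This is an illustrative sketch only: the function name, the use of
Python, and all numerical readings are hypothetical and are not taken from
the measurements reported here.

```python
from statistics import mean, stdev


def relative_triglyceride(tg_readings, protein_readings, control_ratio):
    """Return (mean, standard deviation) of triglyceride/protein ratios,
    expressed as a percentage of a control ratio (control = 100%)."""
    if len(tg_readings) != len(protein_readings):
        raise ValueError("each triglyceride reading needs a protein reading")
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    # Normalize each triglyceride reading to the protein content of the same extract.
    ratios = [tg / prot for tg, prot in zip(tg_readings, protein_readings)]
    # Express every repeat relative to the control level.
    percents = [100.0 * r / control_ratio for r in ratios]
    spread = stdev(percents) if len(percents) > 1 else 0.0
    return mean(percents), spread


# Hypothetical optical-density readings from three repeats (not real data).
control_tg = [0.50, 0.52, 0.48]
control_protein = [1.00, 1.00, 1.00]
mutant_tg = [0.15, 0.16, 0.14]
mutant_protein = [1.00, 1.00, 1.00]

control_ratio = mean([tg / p for tg, p in zip(control_tg, control_protein)])
mutant_mean, mutant_sd = relative_triglyceride(mutant_tg, mutant_protein, control_ratio)
print(f"mutant: {mutant_mean:.1f}% of control (SD {mutant_sd:.1f})")
```

Referencing each reading to protein content before averaging keeps
differences in the amount of extracted material from being mistaken for
differences in triglyceride storage.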

EP(3)3675 homozygous flies consistently show a lower triglyceride content
than the controls (30%; column 2 in FIGURE 1, 'EP(3)3675'). Therefore,
the loss of gene activity in the locus 99F6 on chromosome 3R, where the
EP-vector of EP(3)3675 flies is integrated as a homozygous viable
insertion, is responsible for changes in the metabolism of the energy
storage triglycerides, therefore representing a model for obese flies. The
findings suggest the presence of similar functions of the homologous
proteins in humans.

EP(3)3215 homozygous flies consistently show a lower triglyceride content
than the controls (28%; column 2 in FIGURE 4, 'EP(3)3215'). Therefore,
the loss of gene activity in the locus 95D9 on chromosome 3R, where the
EP-vector of EP(3)3215 flies is integrated as a homozygous viable
insertion, is responsible for changes in the metabolism of the energy
storage triglycerides.

EP(3)0661 heterozygous flies consistently show a higher triglyceride content
than the controls (83%; column 2 in FIGURE 7, 'EP(3)0661/TM3,Sb').
Therefore, the loss of gene activity in the locus 90D1 on chromosome 3R,
where the EP-vector of EP(3)0661 flies is integrated as a heterozygous
lethal insertion, is responsible for changes in the metabolism of the energy
storage triglycerides.

Example 2: Identification of the genes



Genomic DNA sequences were isolated that are localized to the EP vector
(herein EP(3)3675) integration. Using those isolated genomic sequences,
public databases like the Berkeley Drosophila Genome Project (GadFly) were
screened, thereby confirming the homozygous viable integration site of the
EP(3)3675 vector into an intron of a Drosophila gene in sense orientation,
identified as aralar1 (GadFly Accession Number CG2139). FIGURE 2 shows
the molecular organization of this gene locus. The chromosomal localization
site of the integration of the vector of EP(3)3675 is at gene locus 3R,
99F6. In FIGURE 2, genomic DNA sequence is represented by the
assembly as a scaled black line in the middle that includes the integration
site of EP(3)3675. In the upper half of the figure, corresponding BAC
clones and GenBank units are shown. The insertion site of the P-element in
the Drosophila EP(3)3675 line is shown as a triangle and labeled with an
arrow. Black bars in the lower half of the figure, linked by thin black lines,
represent the predicted genes (as predicted by the Berkeley Drosophila
Genome Project, GadFly and by Magpie). Predicted exons of the Drosophila
aralar1 cDNA (GadFly Accession Number CG2139) are shown as black
boxes, predicted introns are shown as thin black lines. Transcribed DNA
sequences (ESTs) are shown as dark grey bars below the predicted
genes line. Therefore, expression of the cDNA encoding aralar1 could be
affected by homozygous integration of vectors of line EP(3)3675, leading
to a decrease of the energy storage triglycerides.

Genomic DNA sequences were isolated that are localized to the EP vector
(herein EP(3)3215) integration. Using those isolated genomic sequences,
public databases like the Berkeley Drosophila Genome Project (GadFly) were
screened, thereby confirming the homozygous viable integration site of the
EP(3)3215 vector into the EST clone DGC LD43943 in antisense
orientation that overlaps with a Drosophila gene, identified as syntaxin1A
(GadFly Accession Number CG18615). FIGURE 5 shows the molecular
organization of this gene locus. The chromosomal localization site of the
integration of the vector of EP(3)3215 is at gene locus 3R, 95D9. In
FIGURE 5, genomic DNA sequence is represented by the assembly as a
dotted black line in the middle that includes the integration sites of vector
for line EP(3)3215. Numbers represent the coordinates of the genomic
DNA (starting at position 19833000 on chromosome 3R, ending at position
19843000 on chromosome 3R). The insertion site of the P-element in the
Drosophila EP(3)3215 line is shown in the upper "P-elements" line.
Predicted genes are shown as bars in the two 'cDNA' lines. Predicted
exons of the syntaxin1A cDNA (GadFly Accession Number CG18615) are
shown as dark black bars and introns as light grey bars in the lower
'cDNA' line. Transcribed DNA sequences (ESTs) are shown as grey bars in
both "EST" lines. Therefore, expression of the cDNA encoding syntaxin1A
(Accession Number CG18615) could be affected by homozygous
integration of vectors of line EP(3)3215, leading to a decrease of the energy
storage triglycerides.
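
The kind of coordinate comparison underlying such a mapping (placing an
insertion site relative to the predicted exons and introns of a gene) can be
sketched as below. This is a minimal illustration, not the procedure used
with GadFly: the function name and the exon coordinates are hypothetical,
chosen only to lie within the 19833000 to 19843000 window mentioned above.

```python
def locate_insertion(position, exons):
    """Classify a genomic position relative to a gene's exons.

    exons is a list of (start, end) pairs, inclusive, all given in the
    same chromosomal coordinate system as position.
    """
    if not exons:
        raise ValueError("at least one exon is required")
    ordered = sorted(exons)
    gene_start = ordered[0][0]
    gene_end = max(end for _, end in ordered)
    if position < gene_start or position > gene_end:
        return "outside gene"
    for start, end in ordered:
        if start <= position <= end:
            return "exon"
    # Inside the gene span but not inside any exon.
    return "intron"


# Hypothetical exon coordinates (not real annotation).
example_exons = [(19834000, 19834500), (19836000, 19836800), (19840000, 19841200)]
print(locate_insertion(19835200, example_exons))
print(locate_insertion(19836100, example_exons))
print(locate_insertion(19842500, example_exons))
```

A position falling between two exons is reported as intronic, and one
beyond the outermost exons is reported as lying outside the gene.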

Genomic DNA sequences were isolated that are localized to the EP vector
(herein EP(3)0661) integration. Using those isolated genomic sequences,
public databases like the Berkeley Drosophila Genome Project (GadFly) were
screened, thereby confirming the heterozygous lethal integration site of the
EP(3)0661 vector into the promoter of RE30936.5 in sense orientation,
representing an EST-clone of a Drosophila gene, identified as cpo (GadFly
Accession Number CG18434). FIGURE 8 shows the molecular organization
of this gene locus. The chromosomal localization site of the integration of
the vector of EP(3)0661 is at gene locus 3R, 90D1. In FIGURE 8, genomic
DNA sequence is represented by the assembly as a thin black scaled line
that includes the integration sites of vector for line EP(3)0661. Numbers
represent the length in basepairs of the genomic DNA. In the uppermost line
of the figure, a corresponding BAC clone is shown. The insertion site of
the P-element in the Drosophila EP(3)0661 line is shown as a triangle and
labeled with an arrow. Predicted genes are shown as labeled bars, linked by
thin lines. Predicted exons of the cpo cDNA (GadFly Accession Number
CG18434) are shown as black bars and are linked by introns, shown as
light grey lines. Transcribed DNA sequences (ESTs) are shown as light grey
bars in the upper part of the figure. Therefore, expression of the cDNA
encoding cpo (Accession Number CG18434) could be affected by
heterozygous integration of vectors of line EP(3)0661, leading to an increase
of the energy storage triglycerides.

Example 3: Identification of human aralar1, syntaxin1A, and cpo
homologues

Aralar1, syntaxin1A, and cpo homologous proteins and nucleic acid
molecules coding therefor are obtainable from insect or vertebrate
species, e.g. mammals or birds. Particularly preferred are nucleic acids
comprising aralar1 (GadFly Accession Number CG2139), a human solute
carrier family 25 (mitochondrial carrier, Aralar), member 12 (GenBank
Accession Number XP_010876.3 for the protein, XM_010876 for the
cDNA), a human solute carrier family 25, member 13 (citrin) (GenBank
Accession Number NP_055066.1 for the protein, NM_014251 for the
cDNA), syntaxin1A (GadFly Accession Number CG18615), a human
syntaxin 1B2 (GenBank Accession Number NP_443106.1 for the protein,
NM_052874 for the cDNA), a human syntaxin 1B (GenBank Accession
Number NP_003154.1 for the protein, NM_003163 for the cDNA), cpo
(GadFly Accession Number CG18434), SEQ ID NO:1 (GadFly Accession
Number CG31243), a human RNA-binding protein gene with multiple
splicing (RBPMS; GenBank Accession Number XP_047075.1 for the
protein, XM_047075 for the cDNA), and a human gene similar to
RNA-binding protein with multiple splicing (RBP-MS; GenBank Accession
Number XP_091097 for the protein, XM_091097 for the cDNA).

As shown in FIGURES 3A and 3B, the gene product of GadFly Accession
Number CG2139 is 74% homologous to human solute carrier family 25
(mitochondrial carrier, Aralar), member 12 (GenBank Accession Number
XP_010876.3) and 73% homologous to human solute carrier family 25,
member 13 (citrin) (GenBank Accession Number NP_055066.1). CG2139
also shows 73% homology on protein level to mouse solute carrier family
25 (mitochondrial carrier; adenine nucleotide translocator), member 13
(GenBank Accession Number NP_056644.1).

As shown in FIGURES 6A and 6B, the gene product of GadFly Accession
Number CG18615 is 83% homologous to human syntaxin 1B2 (GenBank
Accession Number NP_443106.1) and 78% homologous to human
syntaxin 1B (GenBank Accession Number NP_003154.1). CG18615 also
shows 82% homology on protein level to mouse syntaxin 1B (GenBank
Accession Number NP_077725).

The novel gene CG31243 comprises the coding sequence of genes
CG18434 (cpo) and CG18435, as shown in Berkeley Drosophila Genome
Project, predicted proteins Version 3. As shown in FIGURE 9A, the gene
product of Drosophila CG31243 is 62% homologous to human
RNA-binding protein with multiple splicing (GenBank Accession Number
XP_047075.1), and 59% homologous to human protein similar to
RNA-binding protein with multiple splicing (GenBank Accession Number
XP_091097) at the C-terminal part, respectively.

Example 4: Genetic adipose pathway screen 

Adipose (adp) is a protein that has been described as regulating, causing or
contributing to obesity in an animal or human (see WO 01/96371).
Transgenic flies containing a wild type copy of the adipose cDNA under the
control of the Gal4/UAS system were generated (Brand and Perrimon,
1993, Development 118:401-415; for adipose cDNA, see WO 01/96371).
Chromosomal recombination of these transgenic flies with an eyeless-Gal4
driver line has been used to generate a stable recombinant fly line
over-expressing adipose in the developing Drosophila eye. Animals
receiving transgenic adipose activity under these conditions developed into
adult flies with a visible change of eye phenotype. Virgins of the
recombinant driver line were crossed with males of the mutant EP-line
collection in single crosses and kept for preferably 12 to 15 days at 29°C.
The offspring was checked for modifications of the eye phenotype
(enhancement or suppression). Mutations changing the eye phenotype
affect genes that modify adipose activity. The inventors have found that
the fly line HD-EP(3)35715 is a suppressor of the eye-adp-Gal4 induced
eye phenotype. This result strongly suggests an interaction of the cpo
gene with adipose, since the integration of HD-EP(3)35715 was found to be
located at the cpo locus. This supports the function of cpo and
homologous proteins in the regulation of the energy homeostasis.

Example 5: dUCPy modifier screen 

Expression of Drosophila uncoupling protein dUCPy in a non-vital organ like 
the eye (Gal4 under control of the eye-specific promoter of the "eyeless" 
gene) results in flies with visibly damaged eyes. This easily visible eye 
phenotype is the basis of a genetic screen for gene products that can 
modify UCP activity. 

Parts of the genomes of the strain with Gal4 expression in the eye and the
strain carrying the pUAST-dUCPy construct were combined on one
chromosome using genomic recombination. The resulting fly strain has
eyes that are permanently damaged by dUCPy expression. Flies of this
strain were crossed with flies of a large collection of mutagenized fly
strains. In this mutant collection a special expression system (EP-element,
Ref.: Rorth P, Proc Natl Acad Sci U S A 1996, 93(22):12418-22) is
integrated randomly in different genomic loci. The yeast transcription factor
Gal4 can bind to the EP-element and activate the transcription of
endogenous genes close to the integration site of the EP-element. The
activation of the genes therefore occurs in the same cells (eye) that
overexpress dUCPy. Since the mutant collection contains several thousand
strains with different integration sites of the EP-element, it is possible to
test for a large number of genes whether their expression interacts with
dUCPy activity. In case a gene acts as an enhancer of UCP activity, the eye
defect will be worsened; a suppressor will ameliorate the defect.
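
The read-out logic of such a modifier screen (a worsened eye defect
indicates an enhancer, an ameliorated defect a suppressor) can be sketched
as follows. This is a hedged illustration only: the numerical severity score,
the tolerance parameter, and the line names are hypothetical, since the
screen described here relies on visual inspection of the eye phenotype.

```python
def classify_modifier(baseline_severity, cross_severity, tolerance=0.0):
    """Classify a cross by comparing its eye-defect severity with the
    severity of the parental (driver plus dUCPy) strain.

    Higher scores mean a more severe defect; tolerance is the difference
    that is still treated as unchanged.
    """
    if tolerance < 0:
        raise ValueError("tolerance must not be negative")
    if cross_severity > baseline_severity + tolerance:
        return "enhancer"
    if cross_severity < baseline_severity - tolerance:
        return "suppressor"
    return "no modification"


# Hypothetical severity scores on an arbitrary 0-10 scale (not real data).
baseline = 6.0
crosses = {"line_A": 8.5, "line_B": 2.0, "line_C": 6.2}
calls = {name: classify_modifier(baseline, score, tolerance=0.5)
         for name, score in crosses.items()}
print(calls)
```

The tolerance keeps small scoring differences between crosses from being
called as modifiers.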

Using this screen, a gene with suppressing activity was discovered that
was found to be the cpo gene in Drosophila.

Claims



1. A pharmaceutical composition comprising a nucleic acid molecule of
the aralar1, syntaxin1A, or cpo gene family or a polypeptide
encoded thereby or a fragment or a variant of said nucleic acid
molecule or said polypeptide or an antibody, an aptamer or another
receptor recognizing a nucleic acid molecule of the aralar1,
syntaxin1A, or cpo gene family or a polypeptide encoded thereby
together with pharmaceutically acceptable carriers, diluents and/or
adjuvants.

2. The composition of claim 1, wherein the nucleic acid molecule is a
vertebrate or insect aralar1, syntaxin1A, or cpo nucleic acid,
particularly encoding a human solute carrier family 25 (mitochondrial
carrier, Aralar), member 12 protein (GenBank Accession Number
XP_010876.3 for the protein, XM_010876 for the cDNA), a human
solute carrier family 25, member 13 protein (citrin) (GenBank
Accession Number NP_055066.1 for the protein, NM_014251 for
the cDNA), a human syntaxin 1B2 protein (GenBank Accession
Number NP_443106.1 for the protein, NM_052874 for the cDNA),
a human syntaxin 1B protein (GenBank Accession Number
NP_003154.1 for the protein, NM_003163 for the cDNA), a human
RNA-binding protein with multiple splicing (RBPMS; GenBank
Accession Number XP_047075.1 for the protein, XM_047075 for
the cDNA), or a human protein similar to RNA-binding protein with
multiple splicing (RBP-MS; GenBank Accession Number XP_091097
for the protein, XM_091097 for the cDNA), and/or a nucleic acid
molecule which is complementary thereto, or a fragment thereof or
a variant thereof.



3. The composition of claim 1 or 2, wherein said nucleic acid molecule

(a) hybridizes at 50°C in a solution containing 1 x SSC and 0.1%
SDS to a nucleic acid molecule as defined in claim 2 and/or a
nucleic acid molecule which is complementary thereto;

(b) is degenerate with respect to the nucleic acid molecule of
(a);

(c) encodes a polypeptide which is at least 85%, preferably at
least 90%, more preferably at least 95%, more preferably at
least 98% and up to 99.6% identical to a human solute
carrier family 25 (mitochondrial carrier, Aralar), member 12
protein (GenBank Accession Number XP_010876.3 for the
protein, XM_010876 for the cDNA), a human solute carrier
family 25, member 13 protein (citrin) (GenBank Accession
Number NP_055066.1 for the protein, NM_014251 for the
cDNA), a human syntaxin 1B2 protein (GenBank Accession
Number NP_443106.1 for the protein, NM_052874 for the
cDNA), a human syntaxin 1B protein (GenBank Accession
Number NP_003154.1 for the protein, NM_003163 for the
cDNA), a human RNA-binding protein with multiple splicing
(RBPMS; GenBank Accession Number XP_047075.1 for the
protein, XM_047075 for the cDNA), or a human protein
similar to RNA-binding protein with multiple splicing (RBP-MS;
GenBank Accession Number XP_091097 for the protein,
XM_091097 for the cDNA), as defined in claim 2;

(d) differs from the nucleic acid molecule of (a) to (c) by mutation
and wherein said mutation causes an alteration, deletion,
duplication or premature stop in the encoded polypeptide.

4. The composition of any one of claims 1-3, wherein the nucleic acid
molecule is a DNA molecule, particularly a cDNA or a genomic DNA.




5. The composition of any one of claims 1-4, wherein said nucleic acid 
encodes a polypeptide contributing to regulating the energy 
homeostasis and/or the metabolism of triglycerides. 

6. The composition of any one of claims 1-5, wherein said nucleic acid 
molecule is a recombinant nucleic acid molecule. 

7. The composition of any one of claims 1-6, wherein the nucleic acid 
molecule is a vector, particularly an expression vector. 

8. The composition of any one of claims 1-5, wherein the polypeptide 
is a recombinant polypeptide. 



9. The composition of claim 8, wherein said recombinant polypeptide is
a fusion polypeptide.

10. The composition of any one of claims 1-7, wherein said nucleic acid 
molecule is selected from hybridization probes, primers and 
anti-sense oligonucleotides. 

11. The composition of any one of claims 1-10 which is a diagnostic 
composition. 

12. The composition of any one of claims 1-10 which is a therapeutic
composition.

13. The composition of any one of claims 1-12 for the manufacture of
an agent for detecting and/or verifying, for the treatment, alleviation
and/or prevention of disorders, including metabolic diseases such
as obesity and other body-weight regulation disorders as well as
related disorders such as eating disorder, cachexia, diabetes
mellitus, hypertension, coronary heart disease,
hypercholesterolemia, dyslipidemia, osteoarthritis, gallstones,
cancer, e.g. cancers of the reproductive organs, and sleep apnea
and others, in cells, cell masses, organs and/or subjects.

14. Use of a nucleic acid molecule of the aralar1, syntaxin1A, or cpo
gene family or a polypeptide encoded thereby or a fragment or a
variant of said nucleic acid molecule or said polypeptide or an
antibody, an aptamer or another receptor recognizing a nucleic acid
molecule of the aralar1, syntaxin1A, or cpo gene family or a
polypeptide encoded thereby for controlling the function of a gene
and/or a gene product which is influenced and/or modified by an
aralar1, syntaxin1A, or cpo homologous polypeptide.



15. Use of the nucleic acid molecule of the aralar1, syntaxin1A, or cpo
gene family or a polypeptide encoded thereby or a fragment or a
variant of said nucleic acid molecule or said polypeptide or an
antibody, an aptamer or another receptor recognizing a nucleic acid
molecule of the aralar1, syntaxin1A, or cpo gene family or a
polypeptide encoded thereby for identifying substances capable of
interacting with an aralar1, syntaxin1A, or cpo homologous
polypeptide.

16. A non-human transgenic animal exhibiting a modified expression of
an aralar1, syntaxin1A, or cpo homologous polypeptide.

17. The animal of claim 16, wherein the expression of the aralar1,
syntaxin1A, or cpo homologous polypeptide is increased and/or
reduced.

18. A recombinant host cell exhibiting a modified expression of an
aralar1, syntaxin1A, or cpo homologous polypeptide.

19. The cell of claim 18 which is a human cell.

20. A method of identifying a (poly)peptide involved in the regulation of
energy homeostasis and/or metabolism of triglycerides in a mammal
comprising the steps of

(a) contacting a collection of (poly)peptides with an aralar1,
syntaxin1A, or cpo homologous polypeptide or a fragment
thereof under conditions that allow binding of said
(poly)peptides;

(b) removing (poly)peptides which do not bind and

(c) identifying (poly)peptides that bind to said aralar1,
syntaxin1A, or cpo homologous polypeptide.

21. A method of screening for an agent which modulates the interaction
of an aralar1, syntaxin1A, or cpo homologous polypeptide with a
binding target/agent, comprising the steps of

(a) incubating a mixture comprising

(aa) an aralar1, syntaxin1A, or cpo homologous
polypeptide, or a fragment thereof;
(ab) a binding target/agent of said aralar1, syntaxin1A, or
cpo homologous polypeptide or fragment thereof; and
(ac) a candidate agent

under conditions whereby said aralar1, syntaxin1A, or cpo
polypeptide or fragment thereof specifically binds to said
binding target/agent at a reference affinity;

(b) detecting the binding affinity of said aralar1, syntaxin1A, or
cpo polypeptide or fragment thereof to said binding target to
determine a (candidate) agent-biased affinity; and

(c) determining a difference between (candidate) agent-biased
affinity and the reference affinity.

22. A method of producing a composition comprising the (poly)peptide
identified by the method of claim 20 or the agent identified by the
method of claim 21 with a pharmaceutically acceptable carrier,
diluent and/or adjuvant.

23. The method of claim 22 wherein said composition is a
pharmaceutical composition for preventing, alleviating or treating
diseases and disorders, including metabolic diseases such as obesity
and other body-weight regulation disorders as well as related
disorders such as eating disorder, cachexia, diabetes mellitus,
hypertension, coronary heart disease, hypercholesterolemia,
dyslipidemia, osteoarthritis, gallstones, cancer, e.g. cancers of the
reproductive organs, and sleep apnea and other diseases and
disorders.

24. Use of a (poly)peptide as identified by the method of claim 20 or of
an agent as identified by the method of claim 21 for the preparation
of a pharmaceutical composition for the treatment, alleviation and/or
prevention of diseases and disorders, including metabolic diseases
such as obesity and other body-weight regulation disorders as well
as related disorders such as eating disorder, cachexia, diabetes
mellitus, hypertension, coronary heart disease,
hypercholesterolemia, dyslipidemia, osteoarthritis, gallstones,
cancer, e.g. cancers of the reproductive organs, and sleep apnea
and other diseases and disorders.

25. Use of a nucleic acid molecule of the aralar1, syntaxin1A, or cpo
family or of a fragment thereof for the preparation of a non-human
animal which over- or under-expresses the aralar1, syntaxin1A, or
cpo gene product.



26. Kit comprising at least one of

(a) an aralar1, syntaxin1A, or cpo nucleic acid molecule or a
fragment thereof;

(b) a vector comprising the nucleic acid of (a);

(c) a host cell comprising the nucleic acid of (a) or the vector of
(b);

(d) a polypeptide encoded by the nucleic acid of (a);

(e) a fusion polypeptide encoded by the nucleic acid of (a);

(f) an antibody, an aptamer or another receptor against the
nucleic acid of (a) or the polypeptide of (d) or (e) and

(g) an anti-sense oligonucleotide of the nucleic acid of (a).

Abstract

The present invention discloses aralar1, syntaxin1A, or cpo homologous
proteins regulating the energy homeostasis and the metabolism of
triglycerides, and polynucleotides, which identify and encode the proteins
disclosed in this invention. The invention also relates to the use of these
sequences in the diagnosis, study, prevention, and treatment of diseases
and disorders, for example, but not limited to, metabolic diseases such as
obesity as well as related disorders such as eating disorder, cachexia,
diabetes mellitus, hypertension, coronary heart disease,
hypercholesterolemia, dyslipidemia, osteoarthritis, gallstones, cancers of
the reproductive organs, and sleep apnea.

FIGURE 1. Triglyceride content of a Drosophila aralar1 (GadFly Accession
Number CG2139) mutant

FIGURE 2. Molecular organisation of the aralar1 gene (GadFly Accession
Number CG2139)

FIGURE 3. BLASTP results for aralar1 (GadFly Accession Number CG2139)

FIGURE 3A. Homology to human protein XP_010876.3 (GenBank Accession
Number)

ref|XP_010876.3| (XM_010876) solute carrier family 25 (mitochondrial carrier, Aralar),
member 12 [Homo sapiens]
Length = 678

Score = 741 bits (1913), Expect = 0.0
Identities = 382/650 (58%), Positives = 488/650 (74%), Gaps = 14/650 (2%)

[BLASTP alignment of Query residues 1-643 with Sbjct residues 34-676; residue rows not legible]



FIGURE 3B. Homology to human protein NP_055066.1 (GenBank Accession 



ref jNP_055066.1| <NH_014251) solute carrier family 25, member 13 (citrin) [Homo 

sapxens] 

Length =675 

Score = 728 bits (1878), Expect = 0.0 

Identities = 374/643 <58%), Positives a 476/643 (73%), Gaps = 17/643 (2%) 

Query: 1 OTSEDFVIUCFLGLFSESAFiro^ 60 

M+ DFV ++L +F ES N ++V LL+ + D +KDGLISF EP APE +LC PDAL+ 
Sbjct: 35 MSPNDFVTRYLNIFGESQPNPKTVELIjSGVVI^TKIXSLISFQEFVAFESVI^ 94 

Query: 61 AF QLFDRKGNGTVS YADF ADWQKTELH SKI PF SLDGPF IKRYFGDKKQRL INYAEFTQL 120 

AFQLFD+ G G V++ D V +T +H IPF+ D F++ +FG +++R + YAEFTQ 
Sbjct: 95 AFQLFDKAGKGEVTFEDVKQVFGQTTIHQHIPFNWDSEFVQLHFGKERKRHLTYAEFTQF 154 

Query: 121 LHDFHEEHAME AFRSKDPAGTGF I SPLDFQDI I VNVKRHLLTPGVRDNLVSVTEG ~--HK 177 

L + EHA +AF +D A TG ++ +DF+DI+V ++ H+LTP V + LV+ G H+ 
Sbjct: 155 LLEI QLEHAKQAFVQRDNARTGRVTAIDFRDIMVT IRPHVLT PFVEECLVAAAGGTT SHQ 214 

Query: 178 VSFPYFIAFTSIiliNNMELIKQVYLHATEGSRTDM-ITKDQILLAAQTMSQITPLEIDIL^ 236 

VSF Yp p SLLNNMELI+++Y G+R D+ +TK++ +LAAQ Q+TP+E+.DILF 

Sbjct: 215 VSFSYFNGFNSIJJl^ELIRKIY-STLAGTRKDVEVTKEEFVI^QKFGQVTPMEVDILF 273 

Query: 237 HLAGAVHQAGRIDYSDLSNIAP - EHYTKHMTHRLAEIKAVESPAD — RSAFIQVLESSYR 293 

LA GR+ +D+ IAP E T + LAE + ++ D R +QV ES+YR 

Sbjct: 274 QLADLYEPRGRMTLADIERIAPLEEGT — LPFNLAEAQRQKASGDSARPVLLQVAESAYR 331 

Query: 294 FTIX3SFAGAVGATWYPIDLVKTRMQNQRA-GSYIGEVAYRNSWDCFKKVVRHEGFMGLY 352 

F LGS AGAVGAT VYPIDLVKTRMQNQR+ GS++GE+ Y+NS+DCFKKV+R+EGF GLY 
Sbjct: 332 FGLGSVAGAVGATAVYPIDLVKTRMQNQRSTGSFVGEJjMYKNSFDCFKKVLRYEGFFGLY 391 

Query: 353 RGLLPQI^GVAPEKAIKLTVNDLVRDKLTDKKGNI PTWAEVLAGGCAGASQWFTNPLEI 412 

RGLLPQL+GVAPEKAIKLTVND VRDK K G++P AE+LAGGCAG SQV+FTNPLEI 
Sbjct: 392 RGLLPQLLGVAPEKAIKLTVNDF VRDKFMHKDGS VPLAAE ILAGGC AGG SQVT F TNPLEI 451 

Query: 413 VKIRLQVAGE I ASGSK IRAWSWRELGLFGLYKGARACLLRDVPF S AI YFPTY AHTKAMM 472 

VKIRLQVAGEI +G ++ A SWR+LG FG+YKGA+AC LRD+PFSAIYFP YAH KA 
Sbjct: 452 VKIRLQVAGEITTGPRVSALSVVRDLGFFGIYKGAKACFLRDIPFSAIYFPCYAHVKASF 511 

Query: 473 ADKDGYNHPLTLLAAGAIAGVPAASLOTPADTO 532 

A++DG P +LL AGAIAG+PAASLVTPADVIKTRLQV AR+GQTTY+GV D +.KI+ 
Sbjct: 512 ANEDGQVSPGSLLLAGAIAGMPAASLVTPADVIKTRLQVAARAGQTTYSGVIDCFRKILR 571 

Query: 533 EEGPRAFWKGTAARVFRSSPQFGVTLVTYELLQRLFYVDFGGTQPKGSEAHKITTPLEQA 592 

EEGP+A WKG ARVFRS SPQFGVTL+TYELLQR FY+DFGG +P GSE P+ ++ 

Sbjct: 572 EEGPKALWKGAGARVFRSSPQFGVTLLTYELLQRWFYIDFGGVKPMGSE PVPKS 625 

Query: 593 AASVTTENVDHIGGYRAAVPLLAGVESKFGLYLPRFGRGVTAA 635 

++ N DH+GGY+ AV AG+E+KFGLYLP F V+ + 
Sbjct: 626 RINLPAPNPDHVGGYKLAVATFAGIENKFGLYLPLFKPSVSTS 668 







FIGURE 4. Triglyceride content of a Drosophila SyxIA (GadFly Accession 
Number CG18615) mutant 




EP-control EP(3)3215 



FIGURE 5. Molecular organisation of the SyxIA gene (GadFly Accession 
Number CG18615) 








FIGURE 6. BLASTP results for SyxIA (GadFly Accession Number CG18615) 

FIGURE 6A. Homology to human protein NP_443106.1 (GenBank Accession 
Number) 

ref|NP_443106.1| (NM_052874) syntaxin 1B2 [Homo sapiens] 
Length = 288 

Score = 385 bits (988), Expect = e-106 

Identities = 196/284 (69%), Positives = 234/284 (82%), Gaps = 3/284 (1%) 

Query: 3 KDRLAALHAAQSDDEEETEVAVNVDGHDSYMDDFFAQVEEIRGMIDKVQDNVEEVKKKHS 62 

KDR L +A+ D+EE V V+ D +MD+FF QVEEIRG I+K+ + +VE+ VKK+HS 
Sbjct: 2 KDRTQELRSAKDSDDEEEWHVD RDHFMDEFFEQVEEIRGCIEKLSEDVEQVKKQHS 58 

Query: 63 AILSAPQTDEKTKQELEDLMADIKKNANRVRGKLKGIEQNIEQEEQQNKSSADLRIRKTQ 122 

AIL+AP DEKTKQELEDL AD IKK AN+VR KLK IEQ+IEQEE N+SSADLRIRKTQ 
Sbjct: 59 AILAAPNPDEKTKQELEDLTADIKKTANKVRSKLKAIEQSIEQEEGLNRSSADLRIRKTQ 118 

Query: 123 HSTLSRKFVEVMTEYNRTQTDYRERCKGRIQRQLEITGRPTNDDELEKMLEEGNSSVFTQ 182 

HSTLSRKFVEVMTEYN TQ+ YR+RCK RIQRQLEITGR T ++ELE MLE G ++FT 
Sbjct: 119 HSTLSRKFVEVMTEYNATQSKYRDRCKDRIQRQLEITGRTTTNEELEDMLESGKLAIFTD 178 

Query: 183 GIIMETQQAKQTLADIEARHQDIMKLETSIKELHDMFMDMAMLVESQGEMIDRIEYHVEH 242 

I M++Q KQ L +IE RH +I+KLETSI+ELHDMF+DMAMLVESQGEMIDRIEY+VEH 
Sbjct: 179 DIKMDSQMTKQALNEIETRHNEIIKLETSIRELHDMFVDMAMLVESQGEMIDRIEYNVEH 238 

Query: 243 AMDYVQTATQDTKKALKYQSKARRKKIMILICLTVLGILAASYV 286 

++DYV+ A DTKKA+KYQSKARRKKIMI+IC VLG++ AS + 
Sbjct: 239 SVDYVERAVSDTKKAVKYQSKARRKKIMIIICCWI/5VVLASSI 282 



FIGURE 6B. Homology to human protein NP_003154.1 (GenBank Accession 
Number) 

ref|NP_003154.1| (NM_003163) syntaxin IB [Homo sapiens] 
Length = 288 

Score = 364 bits (934), Expect = e-100 

Identities = 186/284 (65%), Positives = 225/284 (78%), Gaps = 3/284 (1%) 

Query: 3 KDRLAALHAAQSDDEEETEVAVNVDGHDSYMDDFFAQVEEIRGMIDKVQDNVEEVKKKHS 62 

KDR L ++ D++E V V+ D +MD+FF Q EEIRG I+K+ + +VE+ VKK+HS 
Sbjct: 2 KDRTQ VLRTRRNS DDKEE WHVD RDHFMDEFFEQEEEIRGCIEKLSEDVEQVKKQHS 58 

Query: 63 AILSAPQTDEKTKQELEDLMADIKKNANRVRGKLKGIEQNIEQEEQQNKSSADLRIRKTQ 122 

AIL+AP DE+TKQELEDL ADIKK AN+VR KLK IEQ+IEQEE LRIRKTQ 
Sbjct: 59 AILAAPNPDERTKQELEDLTADIKKTANKVRSKLKAIEQSIEQEEGSTAPRPILRIRKTQ 118 

Query: 123 HSTLSRKFVEVMTEYNRTQTDYRERCKGRIQRQLEITGRPTNDDELEKMLEEGNSSVFTQ 182 

HSTLSRKFVEVMTEYN TQ+ YR+RCK RIQRQLEITGR T ++ELE MLE G +FT 
Sbjct: 119 HSTLSRKFVEVMTEYNATQSKYRDRCKDRIQRQLEITGRTTTNEELEDMLESGKLPIFTD 178 

Query: 183 GIIMETQQAKQTLADIEARHQDIMKLETSIKELHDMFMDMAMLVESQGEMIDRIEYHVEH 242 

I M++Q KQ L +IE RH + I+KLET SI +ELHDMF+DMAMLVESQGEMIDRI EY+VEH 
Sbjct: 179 DIKMDSQMTKQALNEIETRHNEIIKLETSIRELHDMFVDMAMLVESQGEMIDRIEYNVEH 238 

Query: 243 AMDYVQTATQDTKKALKYQSKARRKKIMILICLTVLGILAASYV 286 

++DYV+ A DTKKA+KYQSKARRKKI+I+IC VLG++ AS + 
Sbjct: 239 SVDYVERAVSDTKKAVKYQSKARRIOCIIIIICCV^^ 282 




FIGURE 7. Triglyceride content of a Drosophila cpo (GadFly Accession Number 
CG18434) mutant 




EP-control EP(3)0681/TM3.Sb 



FIGURE 8. Molecular organisation of the cpo gene (GadFly Accession Number 
CG18434) 
FIGURE 9A. CLUSTAL X (1.81) multiple sequence alignment 



XP_047075 
XP_091097 
CG31243-PA 



LVKIANYQDLLGSHHQLLIAATAAAAAAAAAEPQLQLQHLLPAAPTTPAVISNPINSIGP 



XP_047075 
XP_091097 
CG31243-PA 



INQISSSSHPSNNNQQAVFEKAITISSIAIKRRPTLPQTPASAPQVLSPSPKRQCAAAVS 



XP_047075 
XP_091097 
CG31243-PA 



VLPVTVPVPVPVSVPLPVSVPVPVSVKGHPISHTHQIAHTHQISHSHPISHPHHHQLSFA 



XP_047075 
XP_091097 
CG31243-PA 



HPTQPAAAVAAHHQQQQQQQAQQQQQAVQQQQQQAVQQQQVAYAVAASPQLQQQQQQQQH 



XP_047075 
XP_091097 
CG31243-PA 



XP_047075 
XP_091097 
CG31243-PA 



MSN 



RLAQFNQAAAAALLNQHLQQQHQAQQQQHQAQQQSLAHYGGYQLHRYAPQQQQQHILLSS 



LKPDGEHGGSTGTGSGAGSGGALEEEVGLWPRDLPGGRGRGRAGPAAPRGAGVAIAPGAF 
GSSSSKHNSNNNSNTSAGAASAAVPIATSVAAVPTTGGSLPDSPAHESHSHESNSATASA 



XP_047075 
XP_091097 
CG31243-PA 



MNNGGKAEK ENTp _ 

PTTPSPAGSVTSAAPTATATAAAAGSAAATAAATGTPATSAVSDSNNNLNSSSSSNSNSN 



XP_047075 
XP_091097 
CG31243-PA 



XP_047075 
XP_091097 
CG31243-PA 



XP_047075 
XP_091097 
CG31243-PA 



XP_047075 
XP_091097 
CG31243-PA 



SEANLQEEE VRTLFVSGLPLDIKPRELYLLFRPFKGY 

PCHIUIHHHHTYPNLLQDTVSVIjFVHELKTRPGVRTLFVSGLPVDIKPRELYLLFRPFKGY 

AIMENQMALAPLGLSQ SMDSVN TASNEEEVRTLFVSGLPMDAKPRELYLLFRAYEGY 

: .* :. : **********.* ********** ^ . , 

EGSLIKLTSKQ PVGF VSFDSRSEAEAAKNALNGIRFDPE I PQTLRLEFAKANTKM 

EGSLIKLTARQ PVGFVIFDSRAGAEAAKNALNGIRFDPENPQTLRLEFAKANTKM 

EGSLLKVT SKNGKT A S PVGFVTFHTRAGAE AAKQDLQGVRFDPDMPQT IRLEFAKSNTKV 
****:*;*:•...-::***** *..*• **★**. *.*.****. ***.******.***. 

AKNKLVGTPNPST_PI#PNTVPQFIAREPYELTVPALYPSSPEVWAPYPLYPAELAPAI*PPP 
AKSKI*MATPNPSNVHPALGAHF I ARDPYDLMGAALI PASPEAWAPYPLYTTELTPAI SHA 

SKPKPQPNTATTASHPALMHPLTG HLGGPFFPGGPELWHHPLAYSAAAAAELPGA 

: * * . . . : * : - .:**** * . 



AFTYP ASLHAQ CFSPEAKPN 

AFTYPTATAAAAALHAQRRHIRQCTPTCRIEKLMLKGLVTGEVVIjVTAPRLTPS 
AALQH ATLV 
TPVFCPLLQQIRFVSG NVFVTYQPTADQQRELP C 

TGRLGSKMSVLVWASAGDPGALREEEEEPGQDQALQKAARYPQRC 
HPALHPQVPVRSYL 
FIGURE 9B. Amino acid sequence encoded by Drosophila gene CG31243 
(GadFly Accession Number), SEQ ID NO:1 

>CG31243-PA (AE003720) [gene_syn=CG31243] [prot_desc=CG31243 gene product from 
transcript CG31243-RA] 

LVKIANYQDLLGSHHQLLIAATAAAAAAAAAEPQLQLQHLLPAAPTTPAV 
ISNPINSIGPINQISSSSHPSNNNQQAVFEKAITISSIAIKRRPTLPQTP 
ASAPQVLSPSPKRQCAAAVSVLPVTVPVPVPVSVPLPVSVPVPVSVKGHP 
ISHTHQIAHTHQISHSHPISHPHHHQLSFAHPTQFAAAVAAHHQQQQQQQ 
AQQQQQAVQQQQQQAVQQQQVAYAVAASPQLQQQQQQQQHRLAQFNQAAA 
AALLNQHLQQQHQAQQQQHQAQQQSLAHYGGYQLHRYAPQQQQQHILLSS 
GSSSSKHNSNNNSNTSAGAASAAVPIATSVAAVPTTGGSLPDSPAHESHS 
HESNSATASAPTTPSPAGSVTSAAPTATATAAAAGSAAATAAATGTPATS 
AVSDSNNNLNSSSSSNSNSNAIMENQMALAPLGLSQSMDSVNTASNEEEV 
RTLFVSGLPMDAKPRELYLLFRAYEGYEGSLLKVTSKNGKTASPVGFVTF 
HTRAGAEAAKQDLQGVRFDPDMPQTIRLEFAKSNTKVSKPKPQPNTATTA 
SHPALMHPLTGHLGGPFFPGGPELWHHPLAYSAAAAAELPGAAALQHATL 
VHPALHPQVPVRSYL 